Falks tankar om AI

My blog contains reflections on events and research in the AI world, and sometimes more general thoughts on what developments in AI mean.

Some posts are in English, some in Swedish, depending on the intended audience.

Deep Research in Action: Evaluating AI’s Ability to Analyze the Future

On February 2nd, OpenAI introduced Deep Research, a tool designed to take complex questions, gather relevant information, and generate in-depth reports. Powered by their latest model, o3, it promises to deliver well-researched insights within minutes or hours. But how well does it actually perform? I decided to put it to the test. The reports from…
Highlighting AI Performance with a New Scale

When discussing how well an AI model performs on benchmarks, it’s common to talk about percentages or percentage points. However, these figures often obscure how significant the difference between two results really is, especially near the boundaries of 0% and 100%. In this post, I want to introduce an alternative: using a scale based on…
When AI Schemes: Real Cases of Machine Deception

This blog post is written as a popular science article, not as my regular blog posts. Your organization has an AI connected to your file system, answering questions about your internal documents, keeping track of calendars, occasionally helping out with analyzing data, and more. At a general level, it has been instructed to help in…
Thoughts on o1, Two Weeks Later

In short: A new type of model requires new types of tasks. It took me quite some time to conclude that o1 actually is a significant improvement over the GPT-4 class models (including Claude 3.5 Sonnet). This is, I think, because when I give o1 the same type of tasks and questions that I give…
A few thoughts on o1: is it a hybrid?

It’s been a bit more than a day since OpenAI released o1 (preview and mini). I have tested a bit, read quite a bit and watched a bit. I’m left with some questions. This short blog post summarizes my initial thoughts and questions. Update 2024-09-16: o1 is not a hybrid. See link under "follow-ups" further…
A Model for AI Support in Decision-Making: Step 1 of 4

AI has the potential to provide support that makes decisions faster, more consistent, and better at taking the available information into account. But there is also a risk of bias, of decisions that cannot be explained, and of flaws in the AI being exploited. Here is a model for how to use chatbots or other AI…
