AI Agents – Falk AI

Deep Research in Action: Evaluating AI’s Ability to Analyze the Future

Feb 7, 2025

—

by

On February 2nd, OpenAI introduced Deep Research, a tool designed to take complex questions, gather relevant information, and generate in-depth reports. Powered by their latest model, o3, it promises to deliver well-researched insights within minutes or hours. But how well does it actually perform? I decided to put it to the test. The reports from…

Thoughts on o1, Two Weeks Later

Sep 27, 2024

—

by

Johan Falk

in In English

In short: A new type of model requires new types of tasks. It took me quite some time to conclude that o1 is actually a significant improvement over the GPT-4-class models (including Claude 3.5 Sonnet). This is, I think, because when I give o1 the same type of tasks and questions that I give…

A few thoughts on o1: is it a hybrid?

Sep 13, 2024

—

by

Johan Falk

in In English

It’s been a bit more than a day since OpenAI released o1 (preview and mini). I have tested a bit, read quite a bit and watched a bit. I’m left with some questions. This short blog post summarizes my initial thoughts and questions. Update 2024-09-16: o1 is not a hybrid. See link under “follow-ups” further…

Teams of AI Agents Can Find and Exploit New Cyber Vulnerabilities

Jun 16, 2024

—

by

Johan Falk

in AI risks, In English

New research shows that AIs can be used to find and exploit previously unknown cyber vulnerabilities. While it was already known that AI could generate code to exploit known vulnerabilities based on descriptions, this is the first documented instance of AI discovering new vulnerabilities. The research, conducted by the University of Illinois Urbana-Champaign and funded…

Tag: AI agents