Renewal·마흔의 생활코딩

Agentic | DEVIKA, the Open-Source AI Software Engineer

May 1, 2024·5 min read


The Agentic Concept Series
- Agentic Chunking LangChain RAG
- DEVIKA, the AI Software Engineer (this post)
- CrewAI, an AI agent orchestration framework

I did this hands-on run a month or two ago, kept putting it off, and am only now posting it... and in the meantime Devin went and launched. ;D lol

The conclusion first?!

Devin, Open Devin, and DEVIKA: none of them should be approached merely as 'AI software engineers.' The point is the agent. Treat them as case studies in how several LLMs can be stitched together agentically, in a mixture-of-experts (MoE) style, run through the hands-on yourself, and explore how to apply the idea from your own vantage point.

DEVIKA

DEVIKA is an open-source project that came out before Devin (the AI software engineer) officially launched, and like Devin its main focus is code generation.
Open Devin is a similar project.

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.

GitHub - stitionai/devika (github.com)

Key features

Supports Claude 3, GPT-4, GPT-3.5, and local LLMs via Ollama
Advanced AI planning and reasoning capabilities
Contextual keyword extraction for focused research
Seamless web browsing and information gathering
Code writing in multiple programming languages
Dynamic agent state tracking and visualization
Natural-language interaction via the chat interface
Project-based organization and management
An extensible architecture for adding new features and integrations

Architecture

Agent Core:
Orchestrates the overall AI planning, reasoning, and execution process. Communicates with the various sub-agents.
Agents:
Specialized sub-agents that handle specific tasks such as planning, research, coding, patching, reporting, and so on.
Language Models:
Leverages large language models (LLMs) like Claude, GPT-4, and GPT-3 for natural language understanding and generation.
Browser Interaction:
Enables web browsing, information gathering, and interaction with web elements.
Project Management:
Handles the organization and persistence of project-related data.
Agent State Management:
Tracks and persists the dynamic state of the AI agent across interactions.
Services:
Integrates with external services such as GitHub and Netlify for enhanced capabilities.
Utilities:
Supporting modules for configuration, logging, vector search, PDF generation, and so on.
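
To get a feel for what "tracks and persists the dynamic state of the AI agent across interactions" could look like, here is a small sketch. It is my own illustration, not Devika's implementation: the `AgentState` fields, the `StateStore` class, and the JSON file are all assumptions chosen to keep the example short.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class AgentState:
    """One snapshot of what the agent is doing (fields are hypothetical)."""
    step: int
    message: str
    completed: bool = False


class StateStore:
    """Keeps a per-project history of snapshots and persists it as JSON."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.states: dict = {}
        if path.exists():
            # Reload earlier history so state survives across runs.
            self.states = json.loads(path.read_text(encoding="utf-8"))

    def add(self, project: str, state: AgentState) -> None:
        self.states.setdefault(project, []).append(asdict(state))
        self.path.write_text(json.dumps(self.states, indent=2), encoding="utf-8")

    def latest(self, project: str) -> Optional[dict]:
        history = self.states.get(project, [])
        return history[-1] if history else None
```

A UI could poll something like `latest("my-project")` to show the current step, which is roughly the "state tracking and visualization" idea from the feature list.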

Agents

Devika's cognitive abilities are powered by a collection of specialized sub-agents. Each agent is implemented as a separate Python class and communicates with the underlying LLM through a prompt template defined in Jinja2* format.
*Jinja2: a templating engine for Python. It has full Unicode support, an integrated optional sandboxed execution environment, and a BSD license that allows the software to be freely used, modified, and distributed.
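
To make the "one Python class per agent, one prompt template per agent" idea concrete, here is a minimal sketch. It is not Devika's actual code: the class name, the template text, and the `fake_llm` function are made up for illustration, and the standard library's `string.Template` stands in for Jinja2 so the snippet runs without extra installs.

```python
from string import Template
from typing import Callable

# Hypothetical prompt template. Devika uses Jinja2 templates;
# string.Template is only a stdlib stand-in for this sketch.
PLANNER_PROMPT = Template(
    "You are a planning agent.\n"
    "Break the following request into numbered steps:\n"
    "$user_request"
)


class PlannerAgent:
    """Illustrative sub-agent: renders its prompt and parses the reply."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm  # any function that maps a prompt to a completion

    def render(self, user_request: str) -> str:
        return PLANNER_PROMPT.substitute(user_request=user_request)

    def execute(self, user_request: str) -> list:
        reply = self.llm(self.render(user_request))
        # Keep only non-empty lines; each one is treated as a plan step.
        return [line.strip() for line in reply.splitlines() if line.strip()]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (Claude, GPT-4, Ollama, ...)."""
    return "1. Research the topic\n2. Write the code\n"


if __name__ == "__main__":
    agent = PlannerAgent(fake_llm)
    print(agent.execute("Build a to-do app"))
```

Because the model is passed in as a plain callable, the same agent class can be pointed at a different LLM without touching the template, which is the part worth studying here.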

Planner : generates a step-by-step plan
Researcher : extracts search keywords — gathers additional context from the user
Coder : generates code — runs validation
Action : maps the prompt to a matching action keyword
Runner : executes code in a sandboxed environment — streams output in real time
Feature : implements features — performs incremental testing
Patcher : debugs based on error messages — identifies the cause — proposes fixes
Reporter : generates a report including design — instructions — APIs
Decision : maps to a specific feature such as browser interaction or git clone — runs the function with the provided arguments
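
The Action and Decision entries above both boil down to mapping text onto a concrete function and running it with arguments. A sketch of that dispatch step, assuming the LLM reply has already been parsed into a dict; the handler names, the keywords, and the dict shape are hypothetical, not Devika's.

```python
from typing import Callable


# Hypothetical handlers; real ones would drive git or a browser.
def git_clone(url: str) -> str:
    return f"would clone {url}"


def browse(url: str) -> str:
    return f"would open {url}"


# Keyword -> function table (names are illustrative only).
HANDLERS: dict = {
    "git_clone": git_clone,
    "browser_interaction": browse,
}


def dispatch(decision: dict) -> str:
    """Run the handler named in `decision` with its provided arguments."""
    name = decision["function"]
    handler: Callable = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown function: {name}")
    return handler(**decision.get("args", {}))


if __name__ == "__main__":
    # In the described flow, an LLM would produce this structure.
    print(dispatch({
        "function": "git_clone",
        "args": {"url": "https://github.com/stitionai/devika.git"},
    }))
```

Rejecting unknown keywords explicitly matters in this setup, since the keyword comes from model output rather than from trusted code.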

Hands-on

1. Setting up the development environment

Open LLM: install and run Ollama
Install uv: a Python package manager that replaces the pip, pip-tools, and virtualenv commands
Install bun: a JavaScript runtime, used to run the UI

# On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows.
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# With pip.
pip install uv

2. Pull the DEVIKA code from GitHub

1. git clone https://github.com/stitionai/devika.git
2. cd devika  # move into the cloned folder

3. Set up the virtual environment: uv venv, pip

1) Create

# Create a virtual environment at .venv.
uv venv

2) Activate

# On macOS and Linux.
source .venv/bin/activate

# On Windows.
.venv\Scripts\activate

3) Install packages

# Install the Python dependencies (same command on every OS).
uv pip install -r requirements.txt

# Install the Playwright browsers (and their system dependencies) if required.
playwright install --with-deps
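
Before moving on, it can help to confirm that the commands from the steps above are actually on your PATH. The helper below is a small sketch of my own (the function name is made up); it only uses `shutil.which` from the standard library. Note that `playwright` is only found while the virtual environment is activated.

```python
import shutil
from typing import Callable, Iterable, Optional

# Command names taken from the setup steps above.
REQUIRED_TOOLS = ("ollama", "uv", "bun", "playwright")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All prerequisite commands were found.")
```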

4. Prepare the necessary API keys

1) Search
- Bing: https://www.microsoft.com/en-us/bing/

2) LLM
- Claude: https://console.anthropic.com/setting

3) Fill in the API keys
- Devika/config.toml

5. Turn ON

1) Run the server

python3 devika.py

2) Run the client

cd ui/
bun install
bun run dev

6. Run

1) Go to http://127.0.0.1:3000

2) Pick the model

3) Configure the search engine
- This is one of the main tools the AI engineer uses to research and share related information with the user.
- For reference, DuckDuckGo doesn't require a separate API key.

4) Specify a new project
- The code written by the AI engineer is saved in a folder named after that project.

5) Write the prompt

6) Process and result

1) Real-time review
- Inside the 'inner browser area', the AI engineer shares web search results between research steps and reviews its own research findings and plans.
2) Code generation result
- /[project name]/data/projects : under this folder, a new folder named after the project you specified earlier is created, and the corresponding code is generated inside it.
3) Real-time code review
- You can also write messages right inside the code the AI engineer wrote and communicate with it directly in real time!
- A file generated when I asked it to write up how to use the code

Review

I really like the interface. What I find personally striking is the inter-agent workflow that lets a person review and give feedback in between steps, not just on the final result.
It's still a bit early to post the actual outputs. As with human work, the quality of the generated code is not solely the AI engineer's problem; it is a matter of coordination between the two parties communicating. It's already shaping up into a useful tool. A lot of the examples out there tackle Steam games... but since that's not really my use case, I have been testing things that might apply to daily life, planning work, or studying coding. If something good comes out of it I'll post an update.

This English version was translated by Claude.

#agentic case study #AI engineer #Devika #devin #Moe #open devin

Written by

친절한 찰쓰씨

Pleasant Charles — UI/UX researcher at AIT. Keeping notes on design, planning, and slow days here since 2010.
