Sirji Updates - Episode 1

Yesterday, we told the world about Sirji, an open-source AI software development agent, inspired by Devin.

And with that, we got to work. We promised to build in public. And so, here's our first update on Sirji.

Btw, we are not going to follow a cadence of daily updates. Maybe we will have much more frequent updates for the next few days since there's a lot going on. We will then start dialing it down.

Here's the GitHub repo: https://github.com/sirji-ai/sirji

Today was about releasing the first version of the Researcher module which enables Sirji to gain new knowledge and infer from it.

Researcher

Whenever Sirji comes across a requirement that depends on knowledge points outside its existing knowledge, it invokes the Researcher module. Researcher is based on the RAG (Retrieval-Augmented Generation) framework.

The current implementation uses the OpenAI Assistants API. We have taken care to keep the module composable, which will make strategy changes easier (described in detail below).

The Researcher module has 2 main parts: Embeddings Manager and Inferer.

Embeddings Manager

There can be different strategies for implementing the Embeddings Manager. The Factory and Strategy design patterns are used to improve composability and to make the addition of new strategies easy. Presently, the OpenAI Assistants API is used to upload new documents to the assistant. The Embeddings Manager has 2 major functions:

  1. Index: This is where the new knowledge points are indexed.
  2. Retrieve Context: This is where the context matching the problem statement is retrieved and passed to the Inferer as part of the prompt. In the current OpenAI Assistants API implementation, this step is not needed because the assistant handles retrieval itself, so it is implemented as an empty method. If we move to a vector database for storing embeddings, this is where we would shortlist and populate the retrieved context using an embeddings match.
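To make the Strategy plus Factory arrangement concrete, here is a minimal sketch. All class and method names are hypothetical and chosen for illustration; they are not taken from the Sirji codebase. The "assistant" strategy only records documents instead of calling OpenAI, and the in-memory strategy uses plain word overlap as a stand-in for a real embeddings match.

```python
from abc import ABC, abstractmethod


class EmbeddingsStrategy(ABC):
    """Hypothetical interface: every strategy offers the same two functions."""

    @abstractmethod
    def index(self, document: str) -> None:
        """Add a new knowledge point."""

    @abstractmethod
    def retrieve_context(self, problem_statement: str) -> str:
        """Return the context that best matches the problem statement."""


class AssistantStyleEmbeddings(EmbeddingsStrategy):
    """Stand-in for a hosted-assistant strategy (no network calls here)."""

    def __init__(self):
        self.uploaded = []

    def index(self, document):
        # A real strategy would upload the document to the assistant.
        self.uploaded.append(document)

    def retrieve_context(self, problem_statement):
        # Intentionally empty: retrieval happens on the assistant's side.
        return ""


class InMemoryEmbeddings(EmbeddingsStrategy):
    """Toy local strategy: word overlap instead of real embeddings."""

    def __init__(self):
        self.documents = []

    def index(self, document):
        self.documents.append(document)

    def retrieve_context(self, problem_statement):
        query = set(problem_statement.lower().split())
        scored = [(len(query & set(d.lower().split())), d) for d in self.documents]
        best = max(scored, key=lambda pair: pair[0], default=(0, ""))
        return best[1] if best[0] > 0 else ""


class EmbeddingsFactory:
    """Hides strategy selection behind a single lookup."""

    _strategies = {
        "assistant": AssistantStyleEmbeddings,
        "in_memory": InMemoryEmbeddings,
    }

    @classmethod
    def get_instance(cls, name):
        try:
            return cls._strategies[name]()
        except KeyError:
            raise ValueError(f"Unknown embeddings strategy: {name}") from None
```

With this shape, adding a vector database later would mean writing one more strategy class and registering it in the factory, while callers keep using `index` and `retrieve_context` unchanged.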

Inferer

In this part, the LLM is called to infer using a prompt that contains both the problem statement and the retrieved context from the previous part. In the present OpenAI Assistants API implementation, the inference is made on the same assistant (the assistant id is preserved in the object). There can be different strategies for implementing this part, and to keep it composable we have again used the Strategy and Factory design patterns.
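As a rough illustration of how the problem statement and the retrieved context could be combined, here is a small sketch. The names `build_prompt` and `Inferer` and the prompt layout are assumptions made for this example, and the model call is injected as a plain callable so the snippet runs without any API key.

```python
from typing import Callable


def build_prompt(problem_statement: str, retrieved_context: str) -> str:
    """Combine the retrieved context (if any) with the problem statement."""
    parts = []
    if retrieved_context:
        parts.append("Context:\n" + retrieved_context)
    parts.append("Problem statement:\n" + problem_statement)
    return "\n\n".join(parts)


class Inferer:
    """Hypothetical inferer: the model call is a swappable strategy."""

    def __init__(self, complete: Callable[[str], str]):
        # `complete` takes a prompt and returns the model's answer.
        self._complete = complete

    def infer(self, problem_statement: str, retrieved_context: str = "") -> str:
        return self._complete(build_prompt(problem_statement, retrieved_context))


if __name__ == "__main__":
    # A fake model that just reports the prompt size, for demonstration.
    inferer = Inferer(lambda prompt: f"received {len(prompt)} characters")
    print(inferer.infer("How do I upload a file?", "Files are uploaded first."))
```

When the context is empty, as in the assistant-based setup where retrieval happens remotely, the prompt falls back to the problem statement alone.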

Fun Fact

When developing the Researcher module, we needed to go through the OpenAI Assistants API documentation. This documentation was outside the knowledge of our LLM (gpt-4-turbo-preview). So the model was not able to assist us in development. Rather than going through the documentation manually, we thought of using the Sirji approach to research. We manually indexed (manual, since the automated process is what we needed to develop) a new assistant with the PDF prints of the documentation. After this indexing, the assistant helped us to write the Researcher. This also proved to us that the Sirji way of research works!

Getting Hands Dirty

Let’s run the Researcher! Follow the steps here to run the Researcher module on your machine.

Crawler

The Crawler module is used by the Researcher module. Depending on the type of the URL (PDF file, GitHub repo, or just a webpage URL), the Crawler module applies a different strategy, following the Strategy design pattern. A Factory class encapsulates the strategy selection.

Here is how the crawler module handles different types of URLs:

  • When the crawler discovers a PDF URL, it uses PyPDF2 to extract the information and stores it as a markdown file.
  • If the URL points to a GitHub repository, the crawler clones it and stores it in the specified place.
  • For all other URLs, the crawler utilizes BeautifulSoup to explore the webpage and store the collected data as a markdown file.
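The selection step described above can be sketched as follows. This covers only how a factory might pick a strategy from the URL; the class names and the URL heuristics (a `.pdf` path suffix, a `github.com` host with an owner and repo in the path) are assumptions for illustration, and the actual crawling with PyPDF2, git, or BeautifulSoup is left out.

```python
from urllib.parse import urlparse


class PDFCrawler:
    kind = "pdf"  # would extract text from the PDF and store markdown


class GitHubCrawler:
    kind = "github"  # would clone the repository to the target folder


class WebPageCrawler:
    kind = "webpage"  # would parse the HTML and store markdown


class CrawlerFactory:
    """Hypothetical factory: callers never pick a strategy themselves."""

    @staticmethod
    def get_instance(url: str):
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        path = parsed.path.lower()
        segments = [part for part in path.split("/") if part]

        if path.endswith(".pdf"):
            return PDFCrawler()
        if host in ("github.com", "www.github.com") and len(segments) >= 2:
            return GitHubCrawler()
        return WebPageCrawler()


if __name__ == "__main__":
    for url in (
        "https://example.com/docs/guide.pdf",
        "https://github.com/sirji-ai/sirji",
        "https://example.com/blog/post",
    ):
        print(url, "->", CrawlerFactory.get_instance(url).kind)
```

A suffix check is a simplification: a URL can serve a PDF without ending in `.pdf`, so a sturdier version might also inspect the response's content type.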

Logger

The Logger module accepts two parameters: the name of the log file and the log level. We have used the singleton design pattern to ensure a single instance per file name. The instance is created only when it is first requested for a particular file name, and the same instance is returned when it is requested again for the same file. All logs are stored under the workspace/logs folder. The workspace folder is added to .gitignore to avoid accidental commits.
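A per-file singleton of this kind can be sketched on top of Python's standard `logging` module. The class name `FileLogger`, the `log_dir` parameter, and the log format are assumptions for this example and may differ from Sirji's actual Logger.

```python
import logging
import os


class FileLogger:
    """Hypothetical per-file singleton: one instance per file name."""

    _instances = {}

    def __new__(cls, file_name, log_level=logging.INFO,
                log_dir=os.path.join("workspace", "logs")):
        # Create the instance lazily, on the first request for this file name.
        if file_name not in cls._instances:
            instance = super().__new__(cls)
            os.makedirs(log_dir, exist_ok=True)

            logger = logging.getLogger(f"sketch.{file_name}")
            logger.setLevel(log_level)
            handler = logging.FileHandler(os.path.join(log_dir, file_name))
            handler.setFormatter(
                logging.Formatter("%(asctime)s %(levelname)s %(message)s")
            )
            logger.addHandler(handler)

            instance.logger = logger
            cls._instances[file_name] = instance
        # Later requests for the same file name get the cached instance.
        return cls._instances[file_name]


if __name__ == "__main__":
    first = FileLogger("researcher.log", logging.DEBUG)
    second = FileLogger("researcher.log", logging.DEBUG)
    print(first is second)  # True: same file name, same instance
    first.logger.info("Researcher started")
```

Note that in this sketch the log level and directory passed on later calls are ignored once an instance exists for that file name; whether that is the desired behavior is a design choice.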

Parallel Research

Along with starting on concrete development of the various components of Sirji, our team is also doing research on various topics.

We are looking into self-hosted cloud development environments like Visual Studio Code Server and Coder, which would allow us to securely connect to remote machines from anywhere through a local client, with or without SSH.

Call for Contributions

If you are interested in contributing to Sirji, here are a few enhancement requests which you can take up.

Kedar Chandrayan


I focus on understanding the WHY of each requirement. Once this is clear, then HOW becomes easy. In my blogs too, I try to take the same approach.