AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.
This is an advanced AI-powered research assistant system that utilizes multiple specialized agents to assist in tasks such as data analysis, visualization, and report generation. The system employs LangChain, OpenAI’s GPT models, and LangGraph to handle complex research processes, integrating diverse AI architectures for optimal performance.
The integration of a dedicated Note Taker agent sets this system apart from traditional data analysis pipelines: by maintaining a concise yet comprehensive record of the project's state, it keeps every other agent working from a shared, up-to-date view of the research.
```shell
git clone https://github.com/starpig1129/ai-data-analysis-MulitAgent.git
conda create -n data_assistant python=3.10
conda activate data_assistant
pip install -r requirements.txt
```
Rename `.env Example` to `.env` and fill in all the values:

```
# Your data storage path (required)
DATA_STORAGE_PATH=./data_storage/

# Anaconda installation path (required)
CONDA_PATH=/home/user/anaconda3

# Conda environment name (required)
CONDA_ENV=envname

# ChromeDriver executable path (required)
CHROMEDRIVER_PATH=./chromedriver-linux64/chromedriver

# Firecrawl API key (optional)
# Note: if this key is missing, query capabilities may be reduced
FIRECRAWL_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# OpenAI API key (required)
# Warning: this key is essential; the program will not run without it
OPENAI_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# LangChain API key (optional)
# Used for monitoring the processing
LANGCHAIN_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```
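Since the program refuses to run without the required keys, it can help to validate the environment up front. The following is a minimal sketch (the `missing_required` helper is hypothetical, not part of the repository) of checking the required variables from the example above:

```python
# Hypothetical pre-flight check for the required .env variables.
REQUIRED = ["DATA_STORAGE_PATH", "CONDA_PATH", "CONDA_ENV",
            "CHROMEDRIVER_PATH", "OPENAI_API_KEY"]

def missing_required(env):
    """Return the required keys that are absent or empty in `env`."""
    return [k for k in REQUIRED if not env.get(k)]

# Example against a partially filled configuration:
missing = missing_required({
    "OPENAI_API_KEY": "sk-xxxx",
    "DATA_STORAGE_PATH": "./data_storage/",
})
# `missing` now lists the keys still to be filled in.
```

In the real program you would pass `os.environ` (after loading `.env`) instead of a hand-built dictionary.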
Start Jupyter Notebook:

1. Place `YourDataName.csv` in the `data_storage` directory.
2. Open the `main.ipynb` file.
3. Run all cells to initialize the system and create the workflow.
4. In the last cell, customize the research task by modifying the `userInput` variable.
5. Run the final few cells to execute the research process and view the results.
You can also run the system directly using `main.py`:

1. Place your data file (e.g., `YourDataName.csv`) in the `data_storage` directory.
2. Run the script:

```shell
python main.py
```

The research task is specified via the `user_input` variable, for example:

```python
user_input = '''
datapath:YourDataName.csv
Use machine learning to perform data analysis and write complete graphical reports
'''
```
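The `user_input` convention above puts the data path on a `datapath:` line, followed by the task description. A small sketch of splitting the two (the `parse_user_input` helper is hypothetical, for illustration only; the repository may parse this differently):

```python
# Hypothetical parser for the user_input convention shown above:
# a "datapath:" line names the file, the remaining lines describe the task.
def parse_user_input(text):
    datapath = None
    task_lines = []
    for line in text.strip().splitlines():
        if line.lower().startswith("datapath:"):
            datapath = line.split(":", 1)[1].strip()
        else:
            task_lines.append(line.strip())
    return datapath, " ".join(task_lines)

user_input = '''
datapath:YourDataName.csv
Use machine learning to perform data analysis and write complete graphical reports
'''
path, task = parse_user_input(user_input)
```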
The system uses the following specialized agents:

- `hypothesis_agent`: Generates research hypotheses
- `process_agent`: Supervises the entire research process
- `visualization_agent`: Creates data visualizations
- `code_agent`: Writes data analysis code
- `searcher_agent`: Conducts literature and web searches
- `report_agent`: Writes research reports
- `quality_review_agent`: Performs quality reviews
- `note_agent`: Records the research process

The system uses LangGraph to create a state graph that manages the entire research process.
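The core idea of the state graph is that a supervisor node routes a shared state between agent nodes until the task is complete. A minimal sketch of that pattern in plain Python (agent names mirror the list above, but the routing logic and node bodies are illustrative stand-ins, not the repository's LangGraph code):

```python
# Illustrative supervisor pattern: process_agent inspects the shared state
# and names the next agent node to run; run() loops until it returns None.

def hypothesis_agent(state):
    state["hypothesis"] = "draft hypothesis"
    return state

def report_agent(state):
    state["report"] = f"report based on: {state['hypothesis']}"
    return state

NODES = {"hypothesis_agent": hypothesis_agent, "report_agent": report_agent}

def process_agent(state):
    # Supervisor: choose the next node based on what the state still lacks.
    if "hypothesis" not in state:
        return "hypothesis_agent"
    if "report" not in state:
        return "report_agent"
    return None  # workflow finished

def run(state):
    while (nxt := process_agent(state)) is not None:
        state = NODES[nxt](state)
    return state

final = run({"task": "analyze YourDataName.csv"})
```

In the actual system, LangGraph's `StateGraph` plays the role of this loop, with conditional edges standing in for the supervisor's routing decisions.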
You can customize the system behavior by modifying the agent creation and workflow definition in `main.ipynb`.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
Here are some of my other notable projects:
ShareLMAPI is a local language-model sharing API built on FastAPI, allowing different programs or devices to share the same local model and thereby reduce resource consumption. It supports streaming generation and multiple model configuration methods.
A powerful Discord bot based on multi-modal Large Language Models (LLM), designed to interact with users through natural language.
It combines advanced AI capabilities with practical features, offering a rich experience for Discord communities.