3D-BIO-architect: AI agent with multimodal input and output.
tags: Tutorial, Teaching, ComfyUI, UCL
category: Knowledge
1. Structure of 3D Bio-Architect
The 3D bio-architecture AI agent aims to generate structural design outputs (hypotheses, visual descriptions, and image visualizations) by learning from specific types of biological intelligence through text and image references.
Setting the model parameters. Be careful with the model seed: by default it updates after every run, which means the workflow keeps generating new output even when the input is the same.
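The effect of the seed can be illustrated with a minimal Python sketch. It does not use ComfyUI or any LLM API: `toy_generate` and the word list are hypothetical stand-ins for a model node, and only the standard-library `random` module is assumed. It shows why a fixed seed reproduces the result for the same input, while a seed that changes after every run does not have to.

```python
import random

# Hypothetical vocabulary, used only for this illustration.
WORDS = ["shell", "lattice", "canopy", "spiral", "membrane", "branch"]

def toy_generate(prompt: str, seed: int, n_words: int = 4) -> str:
    """Hypothetical stand-in for a model node: samples words using the seed."""
    rng = random.Random(seed)  # private generator, so runs do not affect each other
    picked = [rng.choice(WORDS) for _ in range(n_words)]
    return f"{prompt}: " + " ".join(picked)

prompt = "nautilus pavilion"

# Fixed seed: the same input always gives the same output.
fixed_a = toy_generate(prompt, seed=42)
fixed_b = toy_generate(prompt, seed=42)
print(fixed_a == fixed_b)  # True

# Seed updated after every run: the same input can give a different output each time.
seeds = [42, 43, 44]
outputs = [toy_generate(prompt, seed=s) for s in seeds]
for s, text in zip(seeds, outputs):
    print(s, text)
```

If you want to compare the effect of a prompt change in isolation, keeping the seed fixed removes the seed as a source of variation.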
3.2 LLM processing group
The LLM “brain” nodes carry out the chain of thought (CoT). The output of each model is fed into the next node as its user prompt.
When you change the image, remember to update the creature name here so that it matches your image.
You can also change the target building type of the design in the second “brain”.
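The chaining described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual nodes: `fake_llm`, `run_chain`, the three stage prompts, and the example creature and building type are all hypothetical, and `fake_llm` just echoes its inputs so the sketch runs without an API key.

```python
from typing import Callable

# Hypothetical type: a function taking (system_prompt, user_prompt) and returning text.
LLM = Callable[[str, str], str]

def fake_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return f"[{system_prompt}] {user_prompt}"

def run_chain(llm: LLM, system_prompts: list[str], first_input: str) -> list[str]:
    """Run the stages in order; each output becomes the next stage's user prompt."""
    outputs = []
    user_prompt = first_input
    for system_prompt in system_prompts:
        user_prompt = llm(system_prompt, user_prompt)
        outputs.append(user_prompt)
    return outputs

creature = "nautilus"       # assumed example; keep in sync with the reference image
building_type = "pavilion"  # assumed example; the design target set in the second stage

stages = [
    f"Describe the structural intelligence of the {creature}",
    f"Propose a {building_type} design hypothesis from this analysis",
    "Write an image prompt visualizing this design",
]
results = run_chain(fake_llm, stages, first_input=f"Reference image of a {creature}")
for step, text in enumerate(results, start=1):
    print(step, text)
```

Because every stage only sees the previous stage's output, a creature name that does not match the image propagates through all later stages, which is why the two need to be changed together.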
3.3 Image generation group
You can change the image generation parameters just as you would in a normal image generation workflow.
💡
If you want to test the same image prompt repeatedly, do so in a separate, standalone image generation workflow rather than running the whole 3D Bio-Architect, because every run of the LLM workflow consumes paid tokens.
4. Further Thinking
4.1 Tips to improve the output quality
💡
Fine-tune the prompt (with the help of AI).
Switch to a more powerful model (GPT-4o, Claude, DeepSeek).
Upgrade the image generation part:
Use an enhancement LoRA.
Use a better base model (a fine-tuned SDXL or FLUX).
Change the structure of the chain of thought.
4.2 More complex and powerful AI agent
💡
Could we combine this with other image-input control approaches (IP-Adapter, for example)?
Could we use the AI agent to critique its own output and improve itself?
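One possible shape for such a self-critique loop is sketched below. It is a thought experiment rather than part of the current workflow: `refine`, `toy_generate`, `toy_critique`, the score scale, and the threshold are all hypothetical, and the toy functions stand in for real LLM calls so the sketch runs offline.

```python
from typing import Callable

def refine(
    generate: Callable[[str], str],
    critique: Callable[[str], tuple[float, str]],
    prompt: str,
    max_rounds: int = 3,
    target_score: float = 0.8,
) -> tuple[str, float]:
    """Generate, critique, and retry with feedback; return the best draft and its score."""
    best_draft, best_score = "", float("-inf")
    current_prompt = prompt
    for _ in range(max_rounds):
        draft = generate(current_prompt)
        score, feedback = critique(draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if score >= target_score:
            break
        # Feed the critic's feedback back into the next generation round.
        current_prompt = f"{prompt}\nRevise the previous draft. Feedback: {feedback}"
    return best_draft, best_score

def toy_generate(prompt: str) -> str:
    """Toy generator: adds a structural detail only once feedback has been supplied."""
    detail = " with a spiral load-bearing shell" if "Feedback:" in prompt else ""
    return "A pavilion inspired by the nautilus" + detail

def toy_critique(draft: str) -> tuple[float, str]:
    """Toy critic: rewards drafts that say how the structure carries load."""
    if "load-bearing" in draft:
        return 0.9, "ok"
    return 0.4, "explain how the structure carries load"

draft, score = refine(toy_generate, toy_critique, "Design a nautilus pavilion")
print(score, draft)
```

The cap on rounds matters in practice: each round is an extra pair of LLM calls, so the loop trades token cost for output quality.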
Simon Shengyu Meng
AI artist driven by curiosity, cross-disciplinary researcher, PhD candidate, science communication blogger.