**What’s the problem anyway? df/dx = 0**

You are a teenager and you have just learned that there is something called differentiation, but you have no idea what it could be useful for. The teacher told you something about how it can be used to calculate velocity and acceleration, but you are still not convinced it's a game changer. One day a farmer shows up in your math book who needs help with the design of his pasture. He has 100 m of fence and wants to enclose a rectangular pasture with the maximum possible area. Your goal is to figure out the length and width such that the area becomes as big as possible. Since you are learning about derivatives, you have already figured out you are about to use them. After looking at some other examples, you simply write a function describing the area with respect to one side, take its derivative, and determine the side length that makes the derivative zero.

It turns out it is 25 m for each side: a square. You ponder whether it's coincidental that each side is exactly a quarter of the total fence length. At this point, you get the naive realization that this could probably be used everywhere to optimize anything. Simply write down the function you wish to optimize, take its derivative, and then solve for when the derivative equals zero.
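Written out, the farmer's problem is short enough to check in a few lines of code. This is a minimal sketch in plain Python (the function names are illustrative): with 100 m of fence, one side x forces the other to be 50 - x, so the area is A(x) = x(50 - x), and its derivative vanishes at x = 25.

```python
def area(x, fence=100.0):
    # Rectangle with perimeter `fence`: one side is x, the other fence/2 - x.
    return x * (fence / 2 - x)

def d_area(x, fence=100.0):
    # Hand-computed derivative: d/dx [x * (F/2 - x)] = F/2 - 2x
    return fence / 2 - 2 * x

# Setting the derivative to zero: F/2 - 2x = 0  =>  x = F/4 = 25
x_opt = 100.0 / 4
print(x_opt, area(x_opt))  # 25.0 625.0
```

Any neighboring side length gives a strictly smaller area, confirming the stationary point is the maximum.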

Years go by, and you slowly start to realize that it isn’t that easy to just write down a function for what you want to optimize. It turns out that things have more complicated geometries than the basic shapes you happen to know the formula for. Perhaps design optimization isn’t that easy after all.

**The professor saves the day**

In comes your university professor and tells you about something called the finite element method (FEM): how you can break down complicated geometries into many tiny, simple geometric shapes and in that way calculate strain, stress, and displacements. Your mind is blown. You can't solve these equations by hand, but you understand that this poses no problem for a computer. After school, you run home and search the internet to discover what more is possible with this technique. That's when you stumble upon topology and topography optimization. You see these amazing, organic-looking designs that embody the true optimal shape. This, you think, is what God intended when He invented product development. What they have done is simply use the “set the derivative to zero” technique, but for FEM simulations.

After struggling to find any of these godlike designs in the real world, you realize you're onto something new. You can't wait to finish university and disrupt the world.

**Enter Reality**

You finally graduate, emerging from the confines of the university into the stark reality awaiting outside. You somehow manage to land a junior position as a product developer, and you are given the task of designing a component for a machine you have never heard of, in an industry you have only just learned about. You ask your manager which optimization software is used, and he responds with a puzzled expression.

Your boss explains that weight isn't even in the top 10 priorities. "There are cost constraints, manufacturability concerns, and a whole slew of other considerations that are more important than weight," he says.

He continues: “Unlike the linear simulations in your textbooks, our real-world scenarios don't neatly boil down to a single derivative. There is no derivative. There are methods for non-linear optimization, but they are orders of magnitude slower than gradient-based optimization.”

You are now starting to question the calculus book that presented the problem with the farmer. Was he even real, you wonder? In reality, farmers tend to place fences along their land's borders, without concern for optimizing geometrical shapes. And how expensive is the fence anyway? Probably negligible compared to all the other costs.

Many can relate to the humbling transition from academia to the stark realities of the professional world. The reality is that most often, things just have to get done, not optimized. But could it be possible to use AI to get things done?

**Fast Forward to 2024**

In recent years, there has been an explosion of AI tools, starting with image generation tools like __Midjourney__ and __DALL·E__. These tools can generate highly realistic images from text prompts. Is it possible to use a similar technique to generate 3D models, potentially replacing traditional CAD tools? Much of the work done in CAD tools is, after all, __not that complicated__.

To answer that question, we first need to clarify what exactly the purpose of CAD is anyway. While there may be many use cases beyond my knowledge, the primary purpose of CAD is to translate conceptual ideas into tangible products that can be manufactured. Historically, this process involved creating 2D drawings by hand, which engineers would then forward to the workshop. Nowadays, most workshops still require 2D drawings, but mainly as a complement to a digital 3D model.

A second important downstream application of CAD is simulation. The CAD model is often used to generate meshes for finite element simulations, which are used for optimizing the product and making sure it is feasible for production. Certainly, there are other significant, albeit less prominent, use cases, such as image rendering, among others.

Ideally, the perfect product development tool would be a machine that takes a problem description as input. It would then output detailed instructions for the optimal solution (such as G-codes), considering the specific environment, including available resources and ethical considerations. Let's examine the current AI contenders, which are poised to either replace the existing paradigm or integrate seamlessly with modern CAD software.

**Voxel based AI**

A voxel is simply the extension of a pixel into 3D space. While a pixel has a fixed x,y position, a voxel has a fixed x,y,z position. Since diffusion models (DALL·E, Midjourney) work so well with pixel images, could the same technique be applied here? The answer is yes, but the results are currently far from as good as they are with pixels. It is also not yet possible to generate completely novel shapes from text alone. Rather, the models allow you to guide the generation process using an image, another 3D model, or some text. However, even if it were possible to generate voxel models from detailed text descriptions, they suffer from being discretized, which makes it very hard to scale them to a resolution where they are actually useful for downstream applications.
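To make the discretization problem concrete, here is a minimal, illustrative voxelization of a sphere in plain Python (not taken from any real tool). The grid holds n³ cells, so doubling the resolution multiplies memory and compute by eight:

```python
def voxelize_sphere(n=16, radius=0.8):
    # A voxel grid is just an n*n*n array of booleans: True wherever
    # the shape occupies space. Here we voxelize a sphere centered
    # in the cube [-1, 1]^3. Storage grows as n**3.
    grid = [[[False] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Map each index to the center of its cell in [-1, 1]
                x, y, z = ((2 * t + 1) / n - 1 for t in (i, j, k))
                grid[i][j][k] = x * x + y * y + z * z <= radius ** 2
    return grid
```

Even at a modest n = 256, that is already ~16.8 million cells, which is why detailed engineering parts are out of reach for voxel generators.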

**Point cloud AI**

__Point cloud models__ represent 3D shapes as points in space. This representation is already widely used in industry, since it is the output of 3D scanners. Recent advances in AI have made it possible to generate point clouds from text alone; however, they suffer not only from the same scaling issue as voxel models but also from not representing topology properly.
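As an illustration of what a point cloud does and does not capture, here is a hypothetical sampler in plain Python. The points describe where the surface is, but nothing connects them, which is why topology is lost:

```python
import math
import random

def sample_sphere_surface(n_points=1000, radius=1.0, seed=0):
    # A point cloud is just an unordered list of (x, y, z) samples.
    # Note what is *missing*: no surfaces, no connectivity, no topology.
    # Downstream tools must reconstruct all of that from raw points.
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # Uniform random direction via normalized Gaussian samples
        v = [rng.gauss(0, 1) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        points.append(tuple(radius * c / norm for c in v))
    return points
```

Whether these samples came from a sphere or from two disjoint hemispheres a hair apart is impossible to tell from the points alone; that ambiguity is the topology problem.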

**Meshes**

This is the classical 3D representation used in computer graphics and FEM. There already exists a __load__ of __companies__ offering __text-to-3D__ mesh generators whose output can be dropped directly into computer games or VR. However, when pushed to the limit, neural nets are not good at generating meshes, since they often end up with self-intersecting surfaces. Like voxels and point clouds, meshes suffer from being discretized.
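A small illustration of why meshes are fragile: a mesh is just vertex coordinates plus index triples, and a single wrong index breaks the surface. The watertightness check below is a simplified sketch, not taken from any particular library:

```python
from collections import Counter

# A triangle mesh: explicit vertex coordinates plus faces that index them.
# This tetrahedron is the kind of structure a generator must get exactly
# right; one wrong index and the surface self-intersects or leaks.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
faces = [  # each face is a triple of vertex indices
    (0, 2, 1),
    (0, 1, 3),
    (0, 3, 2),
    (1, 2, 3),
]

def is_watertight(faces):
    # In a closed (watertight) mesh, every undirected edge is shared by
    # exactly two triangles -- a basic sanity check generators often fail.
    edges = Counter()
    for a, b, c in faces:
        for e in ((a, b), (b, c), (c, a)):
            edges[tuple(sorted(e))] += 1
    return all(count == 2 for count in edges.values())
```

Dropping any one face from the tetrahedron leaves three edges with only one incident triangle, and the check fails.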

**Occupancy Network/Implicit neural representation (INR)**

This is a __new technology__ in which, rather than outputting data that represents a 3D shape directly, the model outputs a function that maps a 3D coordinate to a boolean value: is there material at this point or not? To see what the model looks like, one has to query the function at all points of interest in space. This solves the discretization problem that all the previous representations suffer from, since you can in theory query the function at any floating point coordinate.
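A toy sketch of the idea, with a hard-coded sphere standing in for what would really be a trained neural network:

```python
def sphere_occupancy(x, y, z, radius=1.0):
    # An implicit representation is a *function*, not a data grid:
    # it answers "is there material at (x, y, z)?" for any float
    # coordinate. In a real occupancy network this function would be
    # a trained neural net; here a sphere stands in for the shape.
    return x * x + y * y + z * z <= radius * radius

# Query at arbitrary resolution: no discretization is baked in.
inside = sphere_occupancy(0.0, 0.0, 0.999999)    # True
outside = sphere_occupancy(0.0, 0.0, 1.000001)   # False
```

Rendering or meshing the shape then amounts to sampling this function as finely as the application requires, at query time rather than at generation time.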

**BREP Generators**

One big issue with all the methods mentioned above is that they are not parametric. What if you want to increase the diameter of a hole in your product? Then all you can do is pray that the AI model can make that change for you. Boundary representation (BREP) is the 3D format that CAD software uses. Similar to a programming language, it consists of a set of instructions, functions, and parameters that can be put in sequence to create different 3D models. This makes it easy to change a parameter of an existing model if the generated model is not perfect. One big drawback of this method is that there are no labelled datasets, meaning there is currently no model that allows for text input. The most prominent use case for this technique is therefore autocompletion, but even there the __state-of-the-art__ models are simply not good enough at the moment.
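To illustrate why a parametric history matters, here is a hypothetical, much-simplified "instruction list" model in Python. The operation names are invented for illustration and do not come from any real BREP kernel:

```python
# Hypothetical mini "CAD program": a sequence of named operations with
# parameters, loosely imitating how CAD kernels record model history.
plate = [
    ("sketch_rectangle", {"width": 80.0, "height": 40.0}),
    ("extrude", {"depth": 5.0}),
    ("drill_hole", {"x": 20.0, "y": 20.0, "diameter": 6.0}),
]

def set_parameter(program, op_name, key, value):
    # Because the model is a parametric instruction list, edits are
    # trivial: change one number and replay the history. No AI model
    # has to regenerate the whole shape.
    return [
        (name, {**params, key: value} if name == op_name else params)
        for name, params in program
    ]

# Enlarge the hole without touching anything else in the model.
bigger_hole = set_parameter(plate, "drill_hole", "diameter", 8.0)
```

Contrast this with a voxel grid or a mesh, where "make the hole 2 mm wider" has no corresponding knob to turn.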

**Large Language Models**

All of the previously mentioned methods suffer from a huge handicap: they will always lack context. No useful product is spawned into existence from a vacuum; there is always context that has to be taken into account. The optimal design will look very different depending on what material is used, what production methods are available, what legal restrictions it is subject to, and so on. This is where LLMs shine: they can be provided with a relatively huge context and generate design proposals that take it into consideration. LLMs don't have a direct way of outputting 3D models, but they can generate instructions similar to the BREP models. These instructions can then be translated into a 3D model using existing CAD software. However, one big *flaw* of the current state-of-the-art LLMs is that they are inherently bad at mapping shapes and objects to the correct place in 3D space. They can describe the shapes and positions of objects relative to each other perfectly well, yet they somehow fail dramatically when trying to convert these descriptions into 3D coordinates. Try out our free LLM-based text-to-CAD tool in our __lab__.

**Conclusion**

It’s obvious that no single model currently exists that can replace the CAD software we use today, much less the product developer. Standalone text-to-3D models are simply not able to take in enough information to translate ideas and context into a feasible design. A feasible design might be achievable through iteration, but this approach still falls short of the inherent flexibility and adaptability of parameterized BREP models.

Although useful for games, today's generated meshes are simply not usable for engineering purposes. BREP autocompletion, while a step forward, cannot yet interpret text inputs, relying solely on the intuition derived from its training data. A text-to-BREP system would be an advancement, yet it would still require multiple iterations to refine its output, making the process less efficient than the direct intervention of a skilled designer using traditional CAD tools.

The real game-changer in this landscape could be the LLM. With its ability to process extensive context and generate text at a bandwidth far exceeding human capabilities, it holds immense potential. Bridging the current gap in spatial understanding is crucial. Enhancing text-to-3D models' understanding of text in the context of CAD, or equipping LLMs with tools to overcome their spatial limitations, could unlock unprecedented capabilities in context-aware design. At Visendi, our bet is on the latter approach as the pathway to truly transformative advancements in AI-driven design.
