
Innovating for Generative Elegance

Audience

Those curious about exploring new generative AI use cases.

This article shares thoughts and rationale from training generative AI for pattern matching.

Challenge 1 - Simple but no simpler

A developer aspires to conceive an elegant solution to requirements.
A pattern match (like a regular expression) can be expressed in many ways. Which one is the better code solution?
Can an AI postulate an elegant pattern-match solution across a range of simple-to-complex data samples?

Consider the three string values:

  • "AA"
  • "BB"
  • "CC"

The expression: "2 Alphabetic characters" matches all these values and other intuitively similar values in general flexible way.

Alternatively, the expression "AA" or "BB" or "CC" would be a very specific way to match only these values.

Another way to solve it would be: "A" or "B" or "C", twice over.
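In ObjectScript pattern-match syntax (the notation used in the examples later in this article), these three alternatives might look roughly as follows; the variable name test is illustrative:

 set test = "AA"
 ; general: exactly 2 alphabetic characters
 if test?2A { write "general match",! }
 ; specific: only the three literal values
 if test?1(1"AA",1"BB",1"CC") { write "specific match",! }
 ; per-character alternation, repeated twice
 if test?2(1"A",1"B",1"C") { write "per-character match",! }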

Challenge 2 - Incomplete sample

Pattern problems rarely have every example specified.
A performant AI needs to accept a limited, incomplete sample of data rows and postulate a reasonable pattern-match expression.
A Turing-style goal would be to reach parity with human inference of a pattern from representative but incomplete data.
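For example (a hypothetical sample, not taken from the tool): given only the rows "X7", "K3", and "P9", a human would likely infer "1 alphabetic character followed by 1 numeric character", and the model should propose something equivalent, such as:

 set test = "X7"
 ; hypothetical inferred pattern: 1 alphabetic character followed by 1 numeric
 if test?1A1N { write "match",! }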

Better quality of sample processing is a higher priority than expanding the token window for larger sample sizes.

 

Challenge 3 - Leverage repeating sequences

Extending the previous example to also include the single-character values:

  • "A"
  • "B"
  • "C"

A repeating sequence seems more elegant than specifying ALL of the possible values long-hand. The long-hand enumeration would be:

 if test?1(1"A",1"B",1"C",1"AA",1"BB",1"CC")
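One way to express the repeating-sequence form, as a sketch in the same pattern syntax (where 1.2 means one to two repetitions; note that, like the twice-over alternation above, this also admits mixed pairs such as "AB"):

 set test = "A"
 ; one or two repetitions of the single-character alternation
 if test?1.2(1"A",1"B",1"C") { write "match",! }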

 

Challenge 4 - Delimited data bias

A common need, beyond generalized patterns, is to solve for delimited data, for example a random phone number format:

213-5729-5798

Could be solved by the expression:
3 numeric, dash, 4 numeric, dash, 4 numeric

 if test?3N1"-"4N1"-"4N

This can be normalized with a repeat sequence to:

 if test?3N2(1"-"4N)

Essentially this means having a preference for explicitly specifying the delimiter, for example "-", instead of generalizing delimiters as punctuation characters. Generated output should therefore prefer to avoid over-generalization such as:

 if test?3N1P4N1P4N
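As a quick way to check the preferred explicit-delimiter form against a sample value (a minimal sketch; the variable name is illustrative):

 set test = "213-5729-5798"
 ; explicit "-" delimiter, normalized with a repeat count
 if test?3N2(1"-"4N) { write "matches explicit-delimiter pattern",! }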

 

Challenge 5 - Repeating sequences

Consider formatted numbers with common prefix codes.

The AI model detects three common sequences across the values and biases the solution to reflect an interest in this feature.

On this occasion the AI has decided to generate a superfluous "13" string match.

However, as indicated by the tool, the pattern still matches all of the values provided.

The pattern can easily be adjusted in the free text description and regenerated.
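As a purely hypothetical illustration of what such a superfluous literal can look like (the actual values appear only in the tool's output): given samples such as "1301-AA" and "1302-BB", a generated pattern might pin the shared "13" prefix, where a regenerated pattern could generalize it to digits:

 set test = "1301-AA"
 ; hypothetical generated pattern, pinning a superfluous "13" literal
 if test?1"13"2N1"-"2A { write "match with pinned literal",! }
 ; hypothetical regenerated, more general alternative
 if test?4N1"-"2A { write "general match",! }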

Inference Speed

Workbench AI assistance with qualified partial success can be an accelerator for implementation.
Above a complexity threshold, an AI assistant can deduce proposals faster than manual analysis.
Consider the following AI inference attempt with qualified partial success:

The AI assistant uses as many rows of data as it can fit into its token context window for processing, skipping excess data rows.
The number of rows used is quantified in the generated output, showing how the data was truncated for inference.
This can be useful for elevating preferred data rows back into the context window for refined reprocessing.

 

Training Effort

Targeting an NVIDIA A10 GPU (CUDA) on Huggingface.
Supervised model training.

Continuous GPU training by stage:

  • Prototype base dataset: 4 days
  • Main dataset: 13 days
  • Second refined dataset: 2 days

 

Conclusion

By curating subject-expert bias into the base training data, single-shot generative inference with a constrained token size can usefully approach the elegance of a discrete code solution, even in the absence of chain-of-thought processing.
AI assistants can participate in iterative solution workflows.

 

Explore more

Get hands-on and explore the technology demo currently hosted via Huggingface.
The cog symbol on buttons in the demo indicates where AI generation is being employed.

The demo is written for English, French, and Spanish audiences.

Discussion (2)

This was such a refreshing read — I really liked the idea of combining structure with creative AI output. It’s true that generative systems can sometimes feel messy or disconnected without thoughtful integration. I’m curious though — when it comes to applying this kind of approach in real business environments, how do you handle the actual AI integration part? Especially across complex systems or industries with legacy architecture? Would love to hear what others have tried. 

Through working on this and other projects I have really come to appreciate all the heavy lifting that the Gradio framework does. Most of my time is simply focused on requirements, application logic, and model development goals.
The Gradio UI framework automatically generates functional, machine-callable APIs. By using HuggingFace Spaces to host the Gradio app + GPU model remotely, the API is directly available to be securely consumed from both my local unit test scripts and benchmark scripts.
All without any redesign or parallel API effort:  
See: Gradio view-api-page

Many legacy systems can interoperate with HTTP(S).
So from MCP, the Python or Node.js JavaScript client, or simply using cURL, the API inference service is available.
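As a sketch of what that could look like from an ObjectScript-based legacy system (the host name, endpoint path, payload shape, and SSL configuration name below are all illustrative placeholders; the real endpoint and payload are described on the Gradio view-api page):

 ; hypothetical call to a hosted Gradio inference API over HTTPS
 set req = ##class(%Net.HttpRequest).%New()
 set req.Server = "example-space.hf.space"   ; placeholder Space host
 set req.Https = 1
 set req.SSLConfiguration = "HF"             ; assumes an SSL configuration named "HF" exists
 set req.ContentType = "application/json"
 do req.EntityBody.Write("{""data"":[""213-5729-5798""]}")  ; placeholder payload shape
 set sc = req.Post("/call/predict")          ; placeholder endpoint path
 if $System.Status.IsOK(sc) { write req.HttpResponse.Data.Read(32000),! }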

Gradio conveniently provides work queues, giving a neat configuration lever for balancing contention between server-side business logic, which can happily run, for example, 10 parallel requests, and the GPU inference requests.
On HuggingFace you can implement a two-tier setup (separate CPU app and GPU model inference instances) in a way that passes a session cookie from the client back to the GPU model.
This adds support for per-client request queuing, throttling individual clients so they do not affect the experience of all the other users.

This demo is "Running on Zero", the "stateless" infrastructure beta offering from HuggingFace.
That is, the Gradio app is transparently hosted without managing a specific named instance type, and by annotating GPU functions the inference is offloaded transparently to powerful cloud GPUs.
Updating application settings, the model, or files automatically triggers a Docker image rebuild (if necessary), relaunches the API, and invalidates and reloads the model (if needed).

Another aspect of the Gradio display framework is being able to embed a new Gradio app client-side into an existing web application, instead of implementing a direct server-side integration.

See: Gradio embedding-hosted-spaces  

Gradio has been great for local iterative development, in that it can be launched to monitor its own source files and automatically reload the running application in the open web browser after a source file is (re)saved.

In summary, using Gradio (with Python) has set some quite high expectations of what is achievable from a single application framework: offloading the responsibilities of on-demand infrastructure (via HuggingFace), minimal user interface code, and an automatic API, leaving the focus simply on business domain value and iterating faster.

A good yardstick for feedback on other frameworks.