We have a yummy dataset of recipes written by multiple Reddit users. However, most of the information is free text in the title or description of a post. Let's find out how we can easily load the dataset, extract some features, and analyze it using an OpenAI large language model from Embedded Python with the LangChain framework.
Loading the dataset
First things first: do we need to load the dataset, or can we just connect to it?
There are different ways to achieve this: for instance, the CSV Record Mapper, which you can use in an interoperability production, or even nice OpenExchange appl

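Since the rest of this walkthrough relies on Embedded Python, it may help to see what a plain Python load of a CSV export could look like. This is only a minimal sketch, not the method used in this article: the sample rows and the column names `title` and `description` are assumptions made for illustration, and `load_recipes` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample standing in for the real file. The column names
# "title" and "description" are assumptions, not the dataset's actual schema.
SAMPLE_CSV = """title,description
Pancakes,"Mix flour, milk and eggs, then fry."
Tomato soup,"Simmer tomatoes with onion and blend."
"""


def load_recipes(source):
    """Read recipe rows from a file-like object into a list of dicts."""
    reader = csv.DictReader(source)
    # Strip surrounding whitespace so later text analysis sees clean values.
    return [
        {key: (value or "").strip() for key, value in row.items()}
        for row in reader
    ]


recipes = load_recipes(io.StringIO(SAMPLE_CSV))
print(len(recipes), recipes[0]["title"])
```

With a real file, the same helper could be called as `load_recipes(open(path, newline="", encoding="utf-8"))`, where `path` points at your own CSV export.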