The hitchhiker’s guide to Jupyter

For the last few weeks, I have been teaching myself AI. I have twenty-seven years’ experience in IT, but much less in AI. I am familiar with LLM tools like ChatGPT and Bard/Gemini and how to engineer prompts to get meaningful responses. I’ve even had a go at using them to create applications in PHP and Python. I’ve used DALL-E to create images (like the one for this post), but something was missing. The engineer in me needed to know how they work under the hood. Along this journey (so far) I’ve learned a few things, so I’ve written them down here in case they are of use to anyone else.

1.

The bar to entry is very low and very cheap. I started with the free Fast.ai course ‘Practical Deep Learning for Coders’. I have no affiliation with them and receive no reward, but I can wholeheartedly recommend it. It is easy to follow and breaks the subject down to its fundamentals without obscure language or jargon. There are also excellent free resources from AWS and Google Cloud, each with a spin on their respective offerings.

If someone had told me five weeks ago that a neural network was alien technology leaked from Roswell, I would have had no grounds to doubt them. In these last few weeks, I have gone from no practical experience to cleaning data, using machine learning algorithms, and deploying neural networks in tools like Kaggle and Colab to do my own data analysis and create AI models. I haven’t even needed my own graphics card.

2.

It’s possible to fully automate the entire lifecycle, from ingesting data to publishing applications that use machine learning models. The level of efficiency that can be achieved is incredible. Fully automated pipelines for iterating on models allow all of this to be achieved in hours or days. MLOps is an extension of DevOps that inserts the iterative creation of models into the DevOps development and delivery pipelines. With the right pipeline architecture, it can be seamless and pain-free, allowing you to focus on the context of your data and models rather than the complicated infrastructure that supports them.

3.

It’s all about the data you put in to train your model. If you are doing something generic like telling cats from dogs, handwriting recognition, or sentiment analysis, then there are many public datasets available to train models and get you started. They may be all you need. If you are doing something bespoke, then you will want to prepare your own data. This seems to be the most important step. It seems vital to have people who understand both the tools available and the data, along with its context. Having that relationship is key. Engineering the data can be the difference between a reliable model and an unusable one.

When we talk about ethics in AI, I think this is the step where many of the problems can be avoided. Having people who understand the problem you are trying to solve, and how the datasets you are training with interact, is vital to developing a model that addresses your needs.

4.

It’s all about statistics. We really, really need to stop anthropomorphising AI models. Yes, it’s amazing that an LLM can respond with meaningful text to a free-form question. Yes, it’s amazing that a model can generate an image by ‘understanding’ the sentiment in the requirement. What’s amazing is that we seem to be able to represent human knowledge using statistical graphs. It might even tell us something fundamental about the workings of our own minds.

But let’s not forget that you are talking to a Docker container somewhere in the same way you do when browsing a website. You put data in (usually text), it analyses the data and makes a prediction, then reports the results that correlate. It can only make a prediction based on what it has seen before. It is not sentient. It’s not even a ‘thing’. It’s lots of small things that are good at correlating the relationships in the words you put in with the words you probably want back. With generous helpings of confirmation bias on our part.

There is another ethical alarm bell that needs ringing here. As an AI model is just making statistical predictions, we need to be very careful about giving it agency over taking actions, like driving cars, working machinery, identifying (or not) illnesses, or approving applications. The choice to directly attach the model to an outcome is very much a human one. We choose to code that into our applications. We should be very open about the level of confidence of the predictions and when/what those predictions are allowed to trigger.
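As a rough sketch of what ‘choosing to code that in’ might look like, the snippet below gates an automated action on the model’s reported confidence and sends everything else to a person. The `Prediction` class, the `decide` function, and the 0.95 threshold are all hypothetical; the point is only that the threshold is a human policy decision, not something the model decides.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # what the model predicted, e.g. "approve" or "reject"
    confidence: float  # the model's estimated probability, from 0.0 to 1.0


# Hypothetical policy value: chosen by people, and worth publishing openly.
AUTO_APPROVE_THRESHOLD = 0.95


def decide(prediction: Prediction, threshold: float = AUTO_APPROVE_THRESHOLD) -> str:
    """Return the action the application is allowed to take on this prediction."""
    if prediction.label == "approve" and prediction.confidence >= threshold:
        return "auto_approve"
    # Low-confidence approvals and all rejections go to a person for review.
    return "human_review"


print(decide(Prediction("approve", 0.99)))  # auto_approve
print(decide(Prediction("approve", 0.80)))  # human_review
print(decide(Prediction("reject", 0.99)))   # human_review
```

In this sketch a rejection is never automated, however confident the model is. Whether that is the right rule depends entirely on the application, which is exactly why it should be an explicit, visible choice.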

Lastly, as extraordinary as LLMs are, they are only possible due to the massive amounts of data they consume. That data is the cataloguing and detailing of all our knowledge and experience on the Internet. I don’t mean the pouting selfies and kitten memes (although they might contribute something, I guess), but sites like Wikipedia and the people behind them, like Jimmy Wales and the countless people who create entries and maintain them. That is a truly remarkable achievement.

Methods works with many organisations to engineer data, create applications and DevOps/MLOps automated deployment pipelines. If you have an AI requirement you need help with, you can get more information and contact details here.
