machine learning

Customizing LLMs: Full Fine-Tuning

By Jay Lagare on Monday, May 19, 2025

Full fine-tuning is the traditional method for adapting large language models (LLMs), and it involves updating all of the model’s parameters. While more resource-intensive than parameter-efficient fine-tuning (PEFT) and other methods, it allows for deeper and more comprehensive customization, especially when adapting to significantly different tasks or domains.

Technology

Customizing LLMs: Retrieval Augmented Generation

By Jay Lagare on Monday, March 17, 2025

Retrieval Augmented Generation, or RAG, is a technique that enables generative artificial intelligence (Gen AI) models to retrieve and incorporate new information. It modifies interactions with a large language model (LLM) so that the model responds to user queries with reference to a specified set of documents, using this information to supplement what it learned from its pre-existing training data. This allows LLMs to use domain-specific and/or updated information (Wikipedia).
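The retrieve-then-augment idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the helper names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, the sample documents are invented, and simple word overlap stands in for the embedding-based search a real RAG system would more likely use. The sketch stops at building the augmented prompt; sending it to an LLM is left out.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list, top_k: int = 1) -> list:
    """Return the top_k documents sharing the most words with the query.

    Word overlap is a toy relevance score used only for this sketch.
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, documents: list, top_k: int = 1) -> str:
    """Prepend the retrieved documents to the user's question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Invented sample documents standing in for a domain-specific knowledge base.
docs = [
    "The office is closed on public holidays.",
    "Expense reports are due on the fifth day of each month.",
    "The cafeteria serves lunch from noon to two.",
]

prompt = build_prompt("When are expense reports due?", docs)
print(prompt)
```

The model never needs retraining here: updating the answer is a matter of changing the documents that `retrieve` searches.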

Technology

Transformer²: Self-Adaptive LLMs

By Jay Lagare on Sunday, January 19, 2025

LLMs are typically developed by training on vast amounts of data, known as the corpus. This costs a lot of time and money. GPT-3, for example, cost an estimated $10M to train. This cost is going down, but it remains expensive. You can avoid this cost for specific use cases by “fine-tuning” a model with specific data, or you can augment its prompts with reference data, as in Retrieval Augmented Generation (RAG). The next stage in LLM development is models that update and evolve over time. This is what Sakana AI’s paper Transformer²: Self-Adaptive LLMs discusses.

Technology

Training YOLO to Detect License Plates

By Jay Lagare on Friday, January 3, 2025

The nice thing about ChatGPT and similar systems is that the complexity of AI/ML functionality is hidden behind a friendly natural language interface. This makes it easily accessible to the masses. But behind this easy-to-use facade is a lot of advanced functionality that involves a sequence of data processing steps called a pipeline. An AI-powered business card reader, for example, would first detect text and then recognize the individual letters within the context of the words they belong to. A license plate reader would be similar. Detection is an important step that you often need in your AI/ML projects, and that’s why we will be looking at YOLO.
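The detect-then-recognize pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions: `run_pipeline`, `detect_regions`, and `recognize_text` are hypothetical names, the "image" is a plain dictionary, and the two stages are stand-ins for real models such as a YOLO detector followed by a text recognizer.

```python
from functools import reduce
from typing import Any, Callable, Sequence

Stage = Callable[[Any], Any]


def run_pipeline(stages: Sequence[Stage], data: Any) -> Any:
    """Pass data through each stage in order; one stage's output feeds the next."""
    return reduce(lambda value, stage: stage(value), stages, data)


def detect_regions(image: dict) -> list:
    """Hypothetical detector: keep only the regions flagged as containing text."""
    return [region for region in image["regions"] if region["has_text"]]


def recognize_text(regions: list) -> list:
    """Hypothetical recognizer: read the text out of each detected region."""
    return [region["content"] for region in regions]


# A fake "image" made of pre-labeled regions, invented for this sketch.
image = {
    "regions": [
        {"has_text": True, "content": "ABC 1234"},
        {"has_text": False, "content": ""},
    ]
}

plates = run_pipeline([detect_regions, recognize_text], image)
print(plates)  # ['ABC 1234']
```

Keeping each stage as a separate function means the detector can be swapped or retrained without touching the recognizer.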

Technology

Running Your Own LLM With Ollama

By Jay Lagare on Friday, December 27, 2024

Leveraging the capabilities of Large Language Models (LLMs) through APIs such as the OpenAI APIs is an easy way to add intelligence and advanced functionality to your applications. However, token costs add up and can get quite expensive. Then there’s the nagging question of privacy and security. Finally, you’re limited in your ability to experiment and customize. But if you have a powerful machine with a GPU or two sitting around, wouldn’t it be great to use it to run one of those open-source LLMs? Here’s how you can do it.
