DIY AI: Running Models on a Gaming Laptop for Beginners!

When DeepSeek AI burst onto the scene a week or two ago, it shook up the industry by proving that large language models can be made more efficient – in fact it’s possible to get the full DeepSeek model running on hardware that a mere mortal could acquire with a few thousand bucks. This shift raises an interesting question—can useful AI models run locally on consumer-grade computers now without relying on cloud-based data centers?

In my latest video, we take a look at running some “distilled” open source versions of DeepSeek and Meta’s Llama large language models. I’m surprised how far the quality of locally hosted models has come in such a short period of time.

To find out, I tested a distilled version of the DeepSeek model on a Lenovo Legion 5 laptop, which is equipped with an Nvidia RTX 3070 GPU with 8GB of VRAM. The goal was to see if local AI could generate useful results at a reasonable speed.

The setup process was straightforward. After downloading and installing Nvidia’s CUDA toolkit to enable GPU acceleration, I installed Ollama, a command line tool for running many of the available models. From there, it was just a matter of selecting and downloading an appropriate AI model. Since the full DeepSeek model requires an impractical 404GB of memory, I opted for the distilled 8B version, which uses 4.9GB of video memory.
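Beyond its interactive command line, Ollama also exposes a local REST API that scripts and other tools can call. Here’s a minimal sketch of querying it from Python, assuming Ollama is running on its default port 11434 and that the distilled model was pulled under a tag like `deepseek-r1:8b` (the exact tag is an assumption on my part):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> bytes:
    """Build the JSON body Ollama's /api/generate endpoint expects."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama instance with the model already pulled):
# print(ask("deepseek-r1:8b", "Explain model distillation in two sentences."))
```

With `"stream"` set to `False` the server returns one complete JSON object rather than a stream of partial tokens, which keeps the sketch simple.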

With everything in place, I launched the model and checked that it was using the GPU correctly. The first test was a basic interaction in the command line. The DeepSeek model responded quickly and even displayed its thought process before generating a reply, which is a unique feature compared to traditional locally hosted chatbots. Performance-wise, it was surprisingly snappy for a locally run AI.

To gauge the model’s practical utility, I compared it to Meta’s open-source Llama model, selecting a similarly sized 8B variant. Performance between the two was comparable in terms of speed, but the responses varied. While DeepSeek’s output was structured and fairly coherent, Llama’s responses felt more refined in certain cases.

To take things further, I integrated Open WebUI, which provides a ChatGPT-style interface for easier interaction. This required installing Docker, but once set up, it significantly improved usability.
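As a rough sketch of that setup: Open WebUI runs in a Docker container, and once it’s up you can check that it’s reachable before pointing a browser at it. The docker command and port below are assumptions based on the project’s typical defaults, not an exact recipe:

```python
import urllib.request

# Open WebUI is typically launched with Docker, along these lines (flags and
# image tag may differ by version -- check the project's own docs):
#   docker run -d -p 3000:8080 -v open-webui:/app/backend/data \
#       --name open-webui ghcr.io/open-webui/open-webui:main

def webui_is_up(url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if something answers at the Open WebUI address."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except Exception:
        return False
```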

Next, I tested both models with a programming task—creating a simple Space Invaders game in a single HTML file. DeepSeek struggled, generating a mix of JavaScript and Python code that didn’t function correctly. Even when prompted differently, the results were inconsistent. The larger 14B version of DeepSeek running on my more powerful gaming PC did slightly better but still failed to produce a playable game. The Llama model performed marginally better, generating a somewhat functional version, but it was still far from the quality produced by cloud-based AI models like ChatGPT, which created a polished and working game on the first attempt.

For a different type of challenge, I had the models generate a blog post based on a video transcript. Initially, DeepSeek only provided an outline instead of a full narrative. After refining the prompt, it did produce something usable, though still less polished than ChatGPT’s output. Llama performed slightly better in this task, generating a clearer and more structured narrative after a nudge to get it out of its outlining mindset.

While local AI models aren’t yet on par with their cloud-based counterparts, the rapid improvements in efficiency suggest that practical, high-quality AI could soon run on everyday devices. Now that DeepSeek is pushing the industry to focus on optimization, it’s likely that smaller, more specialized models will become increasingly viable for local use.

For now, running AI on consumer hardware remains a work in progress. But it’s come a long way from where it was just a year ago, so it’ll be exciting to see what happens next.

Plaud AI NotePin Review

I recently got my hands on the NotePin by Plaud AI, a compact and wearable voice recorder with a robust set of AI tools attached through its accompanying mobile app. Plaud’s value-add is that they’ve simplified the process of generating transcriptions (complete with speaker detection) along with AI generated summaries.

You can see it in action in my latest review.

The NotePin is priced at $169 (compensated affiliate link) with an additional $79 per year subscription for the “Pro Plan” that includes additional monthly transcription minutes and additional summarization templates.

The free plan, however, is still quite functional, offering 300 minutes (5 hours) of transcription per month along with the summaries of those transcriptions. The Pro Plan comes with 1,200 monthly minutes (or 20 hours) of transcription time.
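To put those quotas in concrete terms, here’s a trivial sketch of the free-tier math (the numbers are just the plan limits mentioned above):

```python
FREE_MINUTES = 300   # 5 hours of transcription per month on the free plan
PRO_MINUTES = 1200   # 20 hours per month on the $79/year Pro Plan

def fits_free_tier(recording_minutes: list) -> bool:
    """Check whether a month's worth of recordings fits the free quota."""
    return sum(recording_minutes) <= FREE_MINUTES

# Four weekly one-hour meetings fit comfortably in the free plan:
# fits_free_tier([60, 60, 60, 60])  -> True
```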

All of its AI magic happens in the cloud. The NotePin itself is just an audio recorder with 64 GB of storage and enough battery life to run for well over 10 hours between charges. It’s small, lightweight, and comes with accessories to wear it on your wrist, neck, or clipped to your clothes.

One of the things I would have liked to see is a clearer indication that the device is recording. The small red light that turns on is easy to miss, especially when it’s placed on a desk. That said, the recording process is straightforward: press down on the center of the NotePin to start and stop recording, with some haptic feedback to confirm the action.

The Plaud App handles all of the file management and transcription. The device connects via Bluetooth, and while that’s functional, transferring files takes time—an hour-long recording might take five to ten minutes to fully transfer. There’s an option to switch to Wi-Fi mode to speed this up, but it’s not on by default. Once a recording is transferred, you can either keep it as an audio file or send it to the cloud for transcription and summarization.

I tested this at a recent school board meeting, where I was surprised at how well it picked up voices across a large room. After uploading the audio to the app, the transcription process was smooth.

It labels the speakers, but you need to manually assign names to the voices in each session. It unfortunately doesn’t retain the voice prints of speakers that have been identified in prior sessions, so speakers need to be labeled every time. The app doesn’t always differentiate between speakers accurately, especially when they’re far from the microphone, but overall, the transcription quality was impressive.

What I found most interesting was the summary feature. The app generates a concise breakdown of the meeting, highlighting key points and action items. You can also adjust the summary format based on the type of meeting. The summary was mostly accurate, though there were a few minor mistakes. But for anyone looking to quickly capture the essence of a discussion without diving deep into the details, I found it to be quite effective. The minutes can be exported into a number of popular formats like Word, PDF and Markdown.

Another useful feature is that you can upload audio from other sources into the app for transcription, meaning you’re not limited to recordings made on the NotePin itself.

If you don’t exceed the free five hours of transcription per month, you won’t need to pay anything extra, though that could change in the future. Many companies I’ve covered in the past have discovered that a robust set of free server-side features is hard to sustain over the long term.

If you’re in need of a quick, easy, and compact tool for turning meeting recordings into transcripts and summaries without much hassle, this could be a good fit. It’s not doing anything that you couldn’t do yourself with free transcription tools and services like ChatGPT, but I like the turnkey simplicity that Plaud has put together along with an elegant and simple piece of hardware.

Disclosure: Plaud.AI provided the NotePin to the channel free of charge. They did not review or approve this review before it was posted, and all of the opinions expressed are my own.

Run Your Own ChatGPT Alternatives with Chat with RTX and GPT4All

My latest video looks at ChatGPT alternatives that can be operated on personal computers, including PCs and Macs.

I first look at Nvidia’s Chat with RTX, a tool enabling users to run a ChatGPT-like chatbot locally. Chat with RTX only works with Nvidia’s newer 30 or 40 series GPUs, which could be a limitation for some users. I tested it on a Lenovo Legion 5 Pro (affiliate link) that had an RTX 4060 GPU on board. Disclosure: the laptop is on loan from Lenovo.

I then tried GPT4All, an alternative open-source large language model client that offers similar functionality to Chat with RTX but without the need for high-end GPU hardware. Like Chat with RTX, GPT4All is user-friendly, requiring minimal setup and no advanced developer tools. GPT4All is compatible with various operating systems, including Macs, Linux, and Windows, broadening its accessibility. However, for optimal performance, 16 GB of system RAM is recommended, especially on Windows.

In testing these platforms, I observed that while these AI models are capable, they are not nearly as good as ChatGPT. My test involved having the AIs summarize one of my prior video transcripts for a blog post. I found that they more often than not got the context of the video wrong and even made stuff up rather than adhering to the facts in the source text they were summarizing.

But this does show how fast AI technology is moving from large data centers into something that can be run locally on a laptop. I was particularly impressed with how fast and responsive GPT4All was on my M2 MacBook Air as compared to a Lenovo ThinkBook running with a 13th generation Intel processor.

Both chat clients allow the user to choose from a number of different large language models. Although I only looked at three of those models in the video, there are many more offered as a free download to explore. These models are being updated all the time so I’m sure we’ll see some rapid improvements as the year progresses.

ChatGPT Saves Me Time by Converting YouTube Transcripts to Blog Posts

I’ve been around for a while in the tech media space, so I’m always wary when the next new “shiny object” emerges on the scene. Google Glass, VR, crypto and NFTs were mega hyped by influencers only to fall way short when it came to mass consumer adoption.

Over the last several months the chattering influencer class has shifted focus almost entirely to artificial intelligence (AI) driven by the very rapid advancements in Large Language Model (LLM) chatbots like ChatGPT. I haven’t heard a peep about NFTs in months!

I approached this new technology with a healthy degree of skepticism. While it certainly has a “gee whiz” factor to it, could it actually have some real utility in my day-to-day life?

I decided to pony up the $20 monthly subscription fee for ChatGPT Plus to see if it could save me some time and make my workflow more efficient. And surprisingly – it did. You can learn more in my latest video.

I’ve been using ChatGPT to help write these blog posts based on the transcripts of my YouTube videos for the last month or two. Last week ChatGPT became even more useful through the introduction of plugins that allow ChatGPT to perform tasks that go beyond its pre-existing knowledge cutoff of September 2021.

One of the plugins I’ve been using is VoxScript, which can pull down full video transcripts from YouTube that ChatGPT can then use to produce summaries for this blog and my email newsletter.

Here’s how it works: I provide ChatGPT with the URL of my YouTube video and ask it to write a summary in the first person in a journalistic, neutral language style. ChatGPT uses VoxScript to pull down the full transcript from the video and starts writing the summary. The result is usually a well-written summary that captures the key points of the video, saving me about 30 minutes to an hour of writing time.
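Here’s a rough sketch of how a prompt like that could be templated; the wording below is illustrative rather than my verbatim prompt:

```python
def blog_post_prompt(video_url: str) -> str:
    """Assemble the kind of prompt described above (wording is illustrative)."""
    return (
        "Using the VoxScript plugin, pull the transcript of this YouTube video: "
        f"{video_url}\n"
        "Then write a first-person blog post summary in a journalistic, "
        "neutral language style, capturing the key points of the video."
    )

# Example (the URL here is a placeholder):
# print(blog_post_prompt("https://www.youtube.com/watch?v=VIDEO_ID"))
```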

The AI does an impressive job of interpreting the automatically generated YouTube transcripts, even correcting inaccuracies and presenting the information in a coherent manner.

Of course, it’s not perfect, and I do have to tweak some parts to ensure it aligns with my voice and style. But overall, it can generate anywhere from 75-90% of the post depending on what the topic is. This post, for example, needed a little more work done to it by yours truly but the framework it provided was a great time saver.

As AI technology continues to evolve, I’m excited to see how it can further enhance productivity and efficiency in various fields. And AI is more than just chatbots. For example, Tesla’s full self-driving system is an artificial intelligence neural network, trained to drive, that runs locally on their cars.

As always, I’m interested in hearing about your experiences with AI. If you’ve found a practical use for AI that has improved your workflow definitely head over to YouTube and share your experiences in the comments section of the video.

Plex Amp Sonic Sage Adds ChatGPT AI Music Recommendations

In my latest video I dive into the world of AI-powered music discovery with the Plex Amp player and its new “Sonic Sage” feature. Sonic Sage uses ChatGPT to deliver playlist recommendations.

Here’s how it works: Sonic Sage interfaces with OpenAI’s GPT model. To get it running, you’ll need an API key from the OpenAI platform. There is a small cost for using this key but I’ve found it to be minimal. So far I’ve only racked up about 5 cents of cost for well over 20 queries.
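For a rough sense of scale, that works out to about a quarter of a cent per query. A quick back-of-the-envelope sketch (the figures come from my own observed usage, not OpenAI’s published pricing):

```python
def avg_cost_per_query(total_cost_usd: float, query_count: int) -> float:
    """Average API cost per query, in dollars."""
    return total_cost_usd / query_count

def queries_per_dollar(total_cost_usd: float, query_count: int) -> float:
    """How many queries a dollar buys at the observed rate."""
    return 1.0 / avg_cost_per_query(total_cost_usd, query_count)

# At roughly $0.05 across 20 queries:
# avg_cost_per_query(0.05, 20)  -> 0.0025 (a quarter of a cent per playlist)
# queries_per_dollar(0.05, 20)  -> about 400
```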

Once you’ve enabled Sonic Sage, it lives right inside the search icon on your Plex Amp app. ChatGPT uses your queries to generate music recommendations. You can ask it for anything, from general genres to very specific prompts. For example, you could ask for “high energy, lesser-known female rockers from the last 20 years”, and Sonic Sage will whip up a playlist to match.

The AI’s recommendations are based on how you word your prompts. While it’s not perfect at always getting things right, it does a pretty solid job of delivering great music to match what you’re looking for. The only drawback I’ve noticed so far is that these AI-generated playlists can’t be saved, but I’m sure this could change in the future.

This feature works best with a very large personal library or with Tidal, a subscription music service that integrates with Plex and Plex Amp. Tidal costs $8.99 a month if you subscribe through Plex and delivers all of its music as CD quality lossless FLAC audio. I covered the Tidal integration in a previous video.

In my view, Sonic Sage adds an interesting new dimension to Plex Amp’s already awesome music discovery capabilities.