Butterfly Effect: How a backup cleanup job led me to AI research
I cut my teeth on data and ML through PBL (Project-Based Learning)
As a tech enthusiast and lifelong learner, I've always believed in the power of hands-on experience. Today, I want to share with you two pivotal moments in my career that exemplify the incredible potential of Project-Based Learning (PBL). These experiences not only shaped my professional trajectory but also instilled in me some core beliefs about automation and the future of technology.
What is Project-Based Learning?
Before we dive into my experiences, let's briefly define PBL. It's an educational approach where learners gain knowledge and skills by working on real-world projects over an extended period. Instead of passive learning, PBL encourages active problem-solving, critical thinking, and self-directed exploration. It's about learning by doing, facing real challenges, and finding innovative solutions.
Project 1: Taming the Backup Beast at Cisco
My first significant encounter with PBL came during my time as a Virtualization Infrastructure Engineer at Cisco. Fresh out of the gate, I felt like an impostor in this tech giant. My initial role was quite modest - I was essentially a low-level lackey, building virtual machines and installing operating systems. Little did I know that an opportunity to prove myself was just around the corner.
The Challenge:
A few months into my job, our team’s architect approached me with a seemingly simple task. He sent me a screenshot of our data stores, which were overflowing with snapshots (backups). His instruction was equally simple: “We have too many backups. Do something about it.” This open-ended directive became the catalyst for a project that would span several weeks and teach me invaluable lessons.
In this, there’s a key lesson on PBL: projects should be open-ended, and they also need to be self-directed. Tech lore is full of stories about open-ended directives like this one. Allegedly, Google’s PageRank came from an almost identical example. A higher-up (maybe Brin? Not sure) was trying to use Google to find camping gear, but the results were garbage. So he took a screenshot of the search query and results, wrote THESE RESULTS SUCK in big red marker, and taped it up in the engineers’ meeting room. The rest was up to them. My architect did the same thing.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a953268-3aa0-4cbf-8fe8-e9a3b2d87e51_1456x816.png)
The Process:
Initially, I tackled the problem manually. I would look at the data attached to the backups and email individual engineers, asking if I could delete their old snapshots. To my surprise, most of them didn't even realize these backups existed and were more than happy to have them removed. “Oh sure, thanks.” And then we’d save a few hundred GB of data from our SAN array at a time.
As I noticed this pattern, a light bulb went on. What if we could automate this process? The manual approach was effective but incredibly time-consuming, and with thousands of engineers it was simply not sustainable.
This realization led me down the path of scripting and coding. I needed to learn how to:
Extract information from multiple systems
Identify the engineers responsible for each backup
Find their email addresses
Automatically send deletion requests
The Challenges:
The main hurdle was integrating multiple complex systems:
VMware vSphere: Our virtualization platform, where the backups were stored
Active Directory: For identity management and finding user information
Exchange: To send automated emails
I primarily used PowerShell for scripting, with some SQL for database queries. As someone relatively new to scripting, this was a steep learning curve. I had to quickly become proficient in these technologies, understanding how they interacted and how to extract the information I needed.
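To make the shape of that workflow concrete, here is a minimal sketch in Python. It is not the original PowerShell script. The `Snapshot` record, the 30-day cutoff, the `directory` dictionary, and the sample data are all assumptions made up for illustration. In-memory lists and dictionaries stand in for vSphere, Active Directory, and Exchange, and nothing is actually emailed or deleted.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    vm_name: str
    owner: str        # account name recorded against the snapshot (hypothetical field)
    created: date
    size_gb: float


def stale_snapshots_by_owner(snapshots, today, max_age_days=30):
    """Group snapshots older than max_age_days by owner account."""
    grouped = defaultdict(list)
    for snap in snapshots:
        if (today - snap.created).days > max_age_days:
            grouped[snap.owner].append(snap)
    return dict(grouped)


def build_deletion_requests(grouped, directory):
    """Turn each owner's stale snapshots into one (address, subject, body) tuple.

    Owners missing from the directory are returned separately for manual follow-up.
    """
    requests, unresolved = [], []
    for owner, snaps in sorted(grouped.items()):
        address = directory.get(owner)
        if address is None:
            unresolved.append(owner)
            continue
        total_gb = sum(s.size_gb for s in snaps)
        lines = [
            f"- {s.vm_name}: {s.size_gb:g} GB, created {s.created.isoformat()}"
            for s in snaps
        ]
        subject = f"{len(snaps)} old snapshot(s) using {total_gb:g} GB"
        body = "May we delete these old snapshots?\n" + "\n".join(lines)
        requests.append((address, subject, body))
    return requests, unresolved


# Hypothetical sample data standing in for the real inventory and directory.
inventory = [
    Snapshot("build-01", "alice", date(2024, 1, 5), 120.0),
    Snapshot("build-02", "alice", date(2024, 3, 28), 40.0),   # recent, so it is kept
    Snapshot("test-07", "bob", date(2023, 11, 20), 300.0),
    Snapshot("lab-13", "carol", date(2024, 1, 15), 80.0),     # owner not in directory
]
directory = {"alice": "alice@example.com", "bob": "bob@example.com"}

grouped = stale_snapshots_by_owner(inventory, today=date(2024, 4, 1))
requests, unresolved = build_deletion_requests(grouped, directory)
for address, subject, _ in requests:
    print(address, "|", subject)
print("needs manual follow-up:", unresolved)
```

Sending one message per owner rather than one per snapshot keeps the number of emails manageable, and listing unresolved owners separately leaves the ambiguous cases for a human to handle.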
The Results:
After a few weeks of intense work and learning, I launched my script. The results were immediate and impressive. We watched as deletion after deletion went through, freeing up terabytes of high-tier storage. In financial terms, this translated to savings of tens of thousands of dollars. SAN storage arrays are really expensive.
But the impact went beyond just storage savings. This project demonstrated the power of automation and earned me recognition within the team. It was a turning point in my career, shifting me from a “lackey” to someone capable of solving complex, large-scale problems.
The Lesson:
This project taught me a crucial lesson that has stuck with me throughout my career: “If you can automate it a thousand times, you can automate it a billion times.” I realized that automation, once set up correctly, is infinitely scalable. This insight would prove invaluable in my future endeavors.
I would eventually make the pivot and focus on automation almost exclusively, even though my expertise was virtualization and infrastructure. Here’s how I viewed it:
I could spend hours doing things manually, the hard way, every time
OR
I could spend hours creating permanent automation solutions, saving myself hours of work every week
The “hours saved” quickly exceeded 40 hours per week, then hundreds, and finally thousands of hours.
This might sound hyperbolic, but I’m not kidding you. I got to the point where my entire infrastructure was automated, and I had no technical debt and no backlog, and most importantly, no downtime.
Want to learn PBL from me?
I’ll be teaching a PBL course this week in my Pathfinder community. This is part of my Five Pillar Pathfinder’s Journey, in which PBL falls under Pillar 1: Become an AI Power-User.
The Pathfinder’s Journey:
💻 Become an AI Power-User: Demystify AI through hands-on experience
📊 Understand Economic Changes: Navigate the shifting landscape of work
🌿 Back to Basics Lifestyles: Reconnect with your human essence
🧑‍🤝‍🧑 Master People Skills: Enhance your uniquely human capabilities
🎯 Radical Alignment: Discover and live your true purpose
Members of my community are coming from all levels, with all kinds of goals. We have creatives, engineers, business leaders, and everything in between. PBL applies to individuals and enterprises.
Project 2: Predicting the Stock Market with Machine Learning
Emboldened by my success at Cisco and driven by a growing curiosity about artificial intelligence, I decided to embark on a personal PBL journey. My goal? To create a machine learning model that could predict stock market movements. This project, which I pursued partly during my office hours due to its overlap with my work, would consume about six months of my life and push my learning to new heights.
The Challenge:
I set myself an ambitious task: download daily stock data for the entire American stock market, analyze historical trends, and use machine learning to predict future movements. As someone with no formal training in data science or machine learning, this was a monumental undertaking.
The Process:
I started with the basics. I knew I wanted to use Python, as it was becoming the go-to language for data science. I also knew about scikit-learn (sklearn), a popular machine learning library. But beyond that, I was stepping into unknown territory.
Here's how I approached the problem:
Data Collection: I set up systems to download daily stock data for every stock on the American market. I ended up paying for curated API access, a subscription that was $600 per year.
Data Preprocessing: I had to clean this data, handle missing values, and normalize it to make it suitable for machine learning.
Feature Engineering: This was perhaps the most challenging and time-consuming part. I had to figure out which aspects of the stock data were most relevant for prediction. I experimented with various technical indicators, price patterns, and even tried to incorporate some fundamental data.
Model Selection and Training: I chose to use Support Vector Machines (SVM) from sklearn, as they seemed well-suited for this type of prediction task. I spent countless hours tweaking parameters and trying different approaches.
Backtesting and Refinement: I continuously tested my model on historical data, refining and adjusting as I went along.
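The steps above can be sketched end to end in a few lines of scikit-learn. This is a minimal illustration, not my original code. The prices are a synthetic random walk rather than real market data, and the five-day lookback, the RBF kernel, and the up/down label are assumptions chosen to keep the example short. On a random walk the accuracy should hover around chance, which is a fair preview of how hard this problem is.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for one stock's daily closes (a random walk, not real data).
closes = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=600)))
returns = np.diff(closes) / closes[:-1]

# Features for day t: the previous `lookback` daily returns.
# Label for day t: 1 if day t's return was positive, else 0.
lookback = 5
X = np.array([returns[t - lookback:t] for t in range(lookback, len(returns))])
y = (returns[lookback:] > 0).astype(int)

# Scaling sits inside the pipeline so it is refit on each training window only.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# Walk-forward backtest: every fold trains on the past and tests on the days that follow.
fold_accuracy = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model.fit(X[train_idx], y[train_idx])
    fold_accuracy.append(model.score(X[test_idx], y[test_idx]))

print("accuracy per fold:", [round(float(a), 3) for a in fold_accuracy])
```

`TimeSeriesSplit` is used instead of a shuffled split because shuffling would let the model train on days that come after the ones it is tested on, which makes a backtest look better than it is.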
The Challenges:
Learning feature engineering “the hard way” was undoubtedly the biggest challenge. I had to take raw, tabular data and transform it into something meaningful for the machine learning model. This involved a lot of trial and error, reading academic papers, and countless iterations.
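As a small illustration of what that transformation looks like, here is a sketch that turns a list of raw closing prices into a few simple features. The specific features (one-day return, momentum, and distance from a simple moving average), the three-day window, and the prices are assumptions picked for the example, not the set I actually used.

```python
from statistics import mean


def daily_returns(closes):
    """Fractional change from each close to the next."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]


def engineer_features(closes, window=3):
    """Build one feature row per day once `window` days of history exist.

    Each row uses only data available at that day's close.
    """
    returns = daily_returns(closes)
    rows = []
    for i in range(window, len(closes)):
        recent = closes[i - window + 1:i + 1]   # last `window` closes, including today
        sma = mean(recent)
        rows.append({
            "return_1d": returns[i - 1],                      # today vs. yesterday
            "momentum": closes[i] / closes[i - window] - 1,   # today vs. `window` days ago
            "close_vs_sma": closes[i] / sma - 1,              # distance from moving average
        })
    return rows


closes = [100.0, 102.0, 101.0, 104.0, 108.0]   # made-up prices for illustration
rows = engineer_features(closes)
for row in rows:
    print({name: round(value, 4) for name, value in row.items()})
```

Expressing every feature as a ratio rather than a raw price keeps stocks at very different price levels comparable, which matters when one model is trained across many tickers.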
Another significant challenge was computational power. Processing data for the entire stock market and training models on it required substantial resources. I had to learn about optimization techniques and efficient data handling to make the project feasible on my personal computer.
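One common technique for this kind of constraint is streaming: process one ticker at a time instead of loading the whole market into memory. The sketch below shows the idea with a tiny in-memory CSV. The column names, the sample rows, and the assumption that rows arrive sorted by ticker are all hypothetical, and this is an illustration of the general approach rather than a record of exactly what I did.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def per_ticker_stats(rows):
    """Yield (ticker, days, average_close) while holding only one running total in memory.

    Assumes rows arrive sorted by ticker, because groupby only merges adjacent rows.
    """
    for ticker, group in groupby(rows, key=itemgetter("ticker")):
        days, total = 0, 0.0
        for row in group:
            days += 1
            total += float(row["close"])
        yield ticker, days, total / days


# A small in-memory file stands in for a large CSV on disk (made-up prices).
sample = io.StringIO(
    "ticker,date,close\n"
    "AAA,2024-01-02,10.0\n"
    "AAA,2024-01-03,11.0\n"
    "BBB,2024-01-02,50.0\n"
    "BBB,2024-01-03,49.0\n"
    "BBB,2024-01-04,51.0\n"
)

stats = list(per_ticker_stats(csv.DictReader(sample)))
for ticker, days, avg_close in stats:
    print(ticker, days, round(avg_close, 2))
```

Because `csv.DictReader` reads lazily and the function is a generator, memory use stays roughly constant no matter how many tickers the file contains.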
The Results:
While the stock prediction model itself didn’t yield significant financial gains, the skills and insights I gained were invaluable. I learned about the complexities of financial markets, the challenges of prediction, and the intricacies of machine learning. I only made a few investments based on my stock data project, and they were… meh.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb43656-cabf-44be-b2f4-cc5d146b90b5_2912x1632.png)
Interestingly, I later applied similar techniques to cryptocurrency trading. Using the skills I had developed, I created an auto-trading system that managed to make about $600 over 50,000 transactions, or roughly a cent per trade. While not a fortune, it was a practical demonstration of the skills I had acquired.
Once I had something working, I saved it on a DVD and gave it to my wife and said “If anything ever happens to me, use this to take care of yourself.” That was, perhaps, a bit premature of me. I would later learn that there were entire hedge funds based on the use of AI/ML to make all decisions, some of them highly effective.
The Impact of PBL on My Career and Worldview
These two projects were more than just learning experiences; they were transformative. They led me to several profound realizations:
Automation Potential: I came to believe that a sufficiently skilled automation engineer could potentially automate a significant portion of today's economy. This insight has only become more relevant with the advent of advanced AI.
AI's Transformative Power: I realized that AI is dramatically lowering the barrier to entry for automation while expanding the flexibility of computers. Tasks that once required complex, custom-built systems can now be approached with more generalized AI tools.
Skill Transferability: The skills gained from one project often have unexpected applications in future endeavors. My journey from backup management to stock prediction to cryptocurrency trading to AI alignment is a testament to this.
The Power of Self-Directed Learning: These projects showed me that with dedication and curiosity, one can acquire complex skills and tackle significant challenges, even without formal training.
Most importantly, these PBL experiences directly contributed to my current work in AI, the launch of my YouTube channel, and my participation in research. The ability to automate large-scale data processing has been crucial in fine-tuning AI models and curating datasets. Without these foundational experiences, I might never have found my way into the AI field.
How this led directly to my career in AI
What started as an open-ended directive of “Do something about these backups” led me to a series of automation projects that handled increasingly large and complex amounts of data. Those projects built my ability to automate complex technology processes, some of them far more complex than cleaning up backups. My crowning achievement at Cisco was writing automation scripts to update the OS and firmware of our entire virtualization infrastructure. That’s… a longer story. Either way, when GPT-2 and GPT-3 came out, I was ready to go with my Python, PowerShell, RegEx, and ML expertise. The rest, as they say, is history.
You can see those automation skills in my earliest GPT-3 tutorials.
How to PBL
Embarking on a Project-Based Learning (PBL) journey begins with choosing a real-world problem or goal that captures your interest. This might be a challenge at work or a personal project you are passionate about. For example, at Cisco, I was tasked with managing an overflow of backups, which initially seemed daunting but quickly became an exciting opportunity to explore automation.
Once you've identified your project, dive into self-directed exploration. This is where PBL truly shines. You'll need to gather the resources and knowledge necessary to tackle the challenge. This could involve learning new programming languages, familiarizing yourself with relevant tools, or understanding complex systems. In my case, I taught myself PowerShell scripting to automate backup processes and Python for stock market analysis. The key is to approach learning as an ongoing journey, acquiring skills as they become relevant to your project.
As you engage with your project, break it down into manageable steps and start tackling them one by one. For instance, automating the backup process involved first understanding the data, then figuring out how to integrate systems and automate emails. This incremental approach allows you to maintain momentum and continuously learn from each step you take.
Experimentation and iteration are crucial components of PBL. You'll encounter challenges and setbacks, but these are opportunities to refine your approach and deepen your understanding. In the stock market prediction project, I spent countless hours tweaking machine learning models and adjusting parameters, learning more with each iteration. Embrace this process, as it leads to innovative solutions and a deeper grasp of the subject matter.
Throughout your PBL journey, stay open to transferring skills and insights to new areas. The knowledge you gain is often applicable to various domains, as I found when transitioning from automation to financial modeling and later to cryptocurrency trading. This adaptability is a testament to the practical nature of PBL, preparing you for diverse challenges and opportunities. There are no wasted skills or problems in this space.
Project-Based Learning is about taking control of your education through real-world projects. By identifying a problem, exploring solutions, breaking it down into actionable steps, experimenting, and iterating, you can harness the full potential of PBL. This approach not only equips you with practical skills but also fosters a lifelong passion for learning and discovery. Whether you're tackling technical challenges at work or pursuing personal interests, PBL offers a framework for success through hands-on experience and continuous growth.
In a nutshell
Identify a Real-World Problem: Choose a practical challenge or project that you are interested in solving.
Research and Plan: Gather information and resources you need. Identify the skills and tools required to address the problem.
Break Down the Project: Divide the project into smaller, manageable tasks to tackle one step at a time.
Learn by Doing: Begin working on the tasks, using hands-on experience to develop your skills and understanding.
Experiment and Iterate: Test your solutions, learn from mistakes, and continuously refine your approach.
Reflect and Apply: Evaluate what you've learned and apply these skills to new problems or projects in the future.