Prof. Frenzel
8 min read · Aug 5, 2024
What Every Business Analyst Must Know — Part 5: Automation

Dear Analysts🔍!

This part of the series is especially close to my heart because it applies not only to analytics but to pretty much every role I have ever held, and, I believe, to everything we do, from cooking a meal to planning a trip or doing our taxes. One of the first questions at the start of every new project should be: should I do it myself or use technology? Can I build it, or must I outsource it? Or, for many in recent years: can AI do it for me? Many stop asking this question after years on the job, only to find that new colleagues or software solutions have automated parts of their job descriptions. I have been training professionals and college students for over a decade, and I often hear the question: why should I learn this software, or how to code, if I can easily do the task myself? Why invest 10 or even 1,000 hours of learning when you can just do it quickly by hand? My immediate response usually focuses on scale and risk, but with this article, I hope to offer a more nuanced decision model.

When to Automate

Deciding whether to automate a process or keep it manual can significantly impact efficiency and resource allocation. Automation is best applied to repetitive, rule-based tasks, where it reduces human error and frees up time for more complex work. However, as we have all learned, even with advanced AI tools like ChatGPT and Claude, not every task is fit for automation. Processes that require constant human judgment or creativity, or that are performed only infrequently, might not benefit from automation, or may simply be too expensive to automate.

An approach many businesses have adopted is the Hybrid Model, which combines centralized and federated structures: some parts of the organization are automated while others remain under central control. This lets businesses leverage the benefits of automation for tasks that are well suited to it while maintaining manual oversight in areas requiring more nuanced decision-making and creativity. In this context, new job titles like “AI supervisor” are emerging to oversee and manage artificial intelligence systems and their development within an organization. Their responsibilities typically include ensuring the accuracy, efficiency, and ethical operation of AI models; coordinating with data scientists and engineers; monitoring AI system performance; and implementing updates and improvements.

Source: Fottner J, Clauer D, Hormes F, et al. Autonomous systems in intralogistics — state of the art and future research challenges. Logistics Research. 2021;14(1). doi:10.23773/2021_2.

Decision-Making Framework

Before you tackle any automation, you should know exactly what you are dealing with. That means understanding the tasks and processes thoroughly: only when you know each step intimately can you tell which parts are best suited for automation and which require manual intervention.

The next step is to make it work — focus on getting your process to function correctly before worrying about design and scalability. This is a principle I learned early in software engineering: prioritize functionality and validate your results first. Aim for a Minimum Viable Product (MVP) that works, even if it’s not perfect. For my Excel users, this might mean using filters or IF functions before you start nesting or using array functions. If you prefer coding in R or Python, start with a simple for loop that is easy to understand before moving on to vectorization or recursion. Once you are comfortable with this, you can explore more advanced techniques that enhance efficiency and performance.
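
To make this concrete, here is a minimal Python sketch of the loop-first approach; the sales table and the discount rule are hypothetical:

```python
import pandas as pd

# Hypothetical data: order amounts with a discount flag
sales = pd.DataFrame({
    "amount": [120.0, 80.0, 200.0, 50.0],
    "discounted": [True, False, True, False],
})

# Step 1 - make it work: a plain loop whose logic anyone can read and verify
net_loop = []
for _, row in sales.iterrows():
    if row["discounted"]:
        net_loop.append(row["amount"] * 0.9)  # apply a 10% discount
    else:
        net_loop.append(row["amount"])

# Step 2 - make it fast: the vectorized equivalent, once the logic is validated
net_vec = sales["amount"].where(~sales["discounted"], sales["amount"] * 0.9)

# Both paths should agree before you retire the simple version
assert net_loop == net_vec.tolist()
```

The loop is the MVP; the vectorized line is the optimization you adopt only after the two produce identical results.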

Steps to Streamline Data Analytics

1️⃣Understand the Entire Workflow: Knowing each step intimately helps identify which parts are best suited for automation and which require manual intervention. For example, understanding the nuances of data cleaning can help you decide which tasks to automate and which anomalies need a human touch.

2️⃣Assess the Current Pain Points: Identify the most time-consuming and error-prone tasks in your current process. These are often prime candidates for automation. For instance, if generating monthly reports takes an excessive amount of time and is prone to mistakes, automating this process can save time and improve accuracy (see the sketch after this list).

3️⃣Evaluate the Stability of the Process: Automation works best for processes that are stable and repeatable. If a process is constantly changing, it might be better to keep it manual until it stabilizes. For example, a stable process for data ingestion from known sources can be automated, while a constantly evolving data integration task might need manual oversight.

4️⃣Consider the Learning Curve: Factor in the learning curve for using automation tools. Choose tools that fit your team’s expertise and are scalable as their skills grow. For instance, starting with a simple spreadsheet tool like Excel for basic tasks can pave the way to using more complex platforms like SQL or Cloud Services as the team becomes more proficient.

5️⃣Pilot and Iterate: Start with a pilot project to test the automation on a small scale before rolling it out fully. This approach allows you to tweak and optimize the automation process without disrupting the entire workflow.
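
As promised in step 2️⃣, here is a minimal Python sketch of what a piloted monthly-report automation might look like; the file and column names (orders.csv, order_date, amount, order_id) are hypothetical:

```python
import pandas as pd

# Hypothetical input: one row per order, exported from the source system
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])

# Aggregate revenue and order counts per calendar month
report = (
    orders
    .assign(month=orders["order_date"].dt.to_period("M"))
    .groupby("month")
    .agg(revenue=("amount", "sum"), orders=("order_id", "count"))
)

# Write the output where the manual report used to live; during the pilot,
# run this alongside the manual process until the numbers consistently match
report.to_csv("monthly_report.csv")
```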

Generally, automation is most beneficial for tasks that are rule-based, performed frequently, and part of a stable process. To determine which aspects of data analytics to automate, I suggest considering the following criteria:

  1. Task Frequency and Repetitiveness: Tasks performed frequently and repetitively are ideal for automation. For example, automating the generation of daily sales reports can save substantial time and reduce the risk of manual errors.
  2. Task Complexity: Simple, rule-based tasks are easier to automate than those requiring nuanced judgment. Automating data cleaning processes, such as removing duplicates and standardizing formats, can streamline workflows (see the sketch after this list).
  3. Data Volume: Large datasets can be overwhelming for manual analysis, making automation attractive. Automating the aggregation and initial analysis of big data sets can provide quick insights and free up time for deeper analysis. For example, an e-commerce company processing terabytes of customer behavior data can automate initial trend analysis to quickly identify purchasing patterns.
  4. Error Risk: Automating tasks can reduce human error, especially in data preparation and cleaning. For instance, an automated system can consistently flag and correct anomalies in data, ensuring higher accuracy.
  5. Cost Considerations: Weigh the cost of automation tools and development against potential savings. Investing in an automated ETL (Extract, Transform, Load) tool might be justified if it significantly reduces the man-hours required for data processing over time. For instance, a logistics company might invest in such tools to automate data extraction from shipment records, saving $100,000 annually in labor costs.
  6. Flexibility: While automation excels in handling repetitive tasks, manual processes offer greater flexibility. Automating flexible tasks can be challenging! For instance, a creative agency might rely on manual methods for tasks requiring artistic judgment, such as designing bespoke marketing campaigns.
  7. Integration: Automated systems should seamlessly integrate with existing tools and workflows. For instance, an automated CRM (Customer Relationship Management) system that integrates with marketing and sales platforms can provide real-time customer insights, enhancing cross-departmental collaboration.
  8. Maintenance: Consider the long-term maintenance required for automated systems. Automated solutions might need regular updates and monitoring to stay effective.
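
To illustrate criterion 2, here is a minimal pandas sketch of an automated cleaning step; the customer table and its columns are hypothetical:

```python
import pandas as pd

# Hypothetical customer data with inconsistent formatting and a duplicate
customers = pd.DataFrame({
    "email": ["A@X.COM", "a@x.com ", "b@y.com"],
    "signup": ["2024-01-05", "2024-01-05", "2024-01-07"],
})

# Standardize formats first, then deduplicate on the cleaned values
customers["email"] = customers["email"].str.strip().str.lower()
customers["signup"] = pd.to_datetime(customers["signup"])
customers = customers.drop_duplicates(subset="email", keep="first")

print(customers)  # one row per unique email, with consistent types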

Automation in Data Analytics

Automation in data analytics offers numerous benefits, and with new developments in the AI landscape, several tasks that required deep coding knowledge in the past (e.g., text mining or integrating API hooks) are now easy to set up. Here are some specific use cases that might be interesting to you:

  • Data Extraction: Automating data extraction, such as web scraping, can gather large volumes of data from diverse sources efficiently. Tools like Beautiful Soup and Scrapy can automate the process of extracting data from websites, APIs, and databases. For instance, a market research firm can use Scrapy to gather data from competitor websites, significantly reducing the time spent on manual data collection and ensuring comprehensive data acquisition for analysis (see the sketch after this list).
  • Anomaly Detection: Automation enhances anomaly detection processes by identifying unusual patterns in data without human intervention. Tools like Splunk and Anodot can monitor data in real-time to detect anomalies.
  • Dashboard Creation and Reporting: Automating the generation of dashboards and reports provides real-time insights with minimal manual intervention. Tools like Tableau and Power BI can automatically pull data from various sources, update visualizations, and distribute reports.
  • Data Preparation: Automating data preparation tasks like data cleaning, transformation, and feature engineering can significantly speed up the data analysis process. Platforms like Alteryx and KNIME can automate the entire data preparation workflow.
  • Data Validation: Automated data validation systems can detect typos, flag and impute missing values, and identify content and formats that don’t match a dynamic data model. Tools like Trifacta and DataRobot can automate these validation processes.
  • Model Training and Evaluation: Automating the training and evaluation of machine learning models can streamline the development cycle. Tools like Amazon SageMaker and Google AutoML can automate these processes, enabling more frequent updates and improving model accuracy. A healthcare analytics company might use SageMaker to automate the training of predictive models for patient outcomes, increase predictive accuracy, and enable more personalized patient care.
  • Data Maintenance: Automation simplifies data maintenance tasks such as modifying and tuning a data warehouse. Tools like Talend and Informatica can facilitate the automatic integration of new data sources and migration from legacy systems. An enterprise might use Talend to update its customer database continuously, reducing manual data entry errors and maintaining data consistency across the organization.
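
As a concrete example of the data-extraction use case, here is a minimal Beautiful Soup sketch; the URL and CSS selectors are placeholders, since both depend entirely on the target site’s markup (and on its terms of use):

```python
import requests
from bs4 import BeautifulSoup

# Placeholder target; swap in the real page and selectors for your site
URL = "https://example.com/products"

response = requests.get(URL, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Collect the name and price of every listed product
products = []
for item in soup.select("div.product"):  # selector depends on the site's HTML
    name = item.select_one("h2").get_text(strip=True)
    price = item.select_one("span.price").get_text(strip=True)
    products.append({"name": name, "price": price})

print(products)
```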

Manual Processes in Data Analytics

While automation offers significant advantages, there are scenarios where manual processes are more appropriate. Manual methods provide the flexibility and human intuition needed for complex and unique tasks that automation cannot easily handle. Here are some specific use cases:

📌 In-depth Analysis: Certain tasks, such as developing hypotheses, interpreting complex results, and making strategic decisions, benefit from the nuanced understanding that only humans can provide. For example, a business analyst manually reviewing customer feedback to uncover underlying sentiments and trends can offer insights that automated sentiment analysis tools might miss.

📌 Flexibility and Adaptability: Manual processes are essential for tasks requiring a high degree of flexibility and adaptability. For instance, when exploring new datasets or conducting initial exploratory data analysis, human intuition and creativity play a key role. A data scientist might manually examine a new dataset to identify patterns, outliers, or potential hypotheses before selecting the best analytical approach.

📌 Complex Decision Making: Tasks involving high-stakes decisions that require contextual understanding and nuanced judgment typically remain manual. For instance, in fraud detection, machine learning models can flag suspicious transactions, but human analysts are needed to review and validate these cases, especially in ambiguous or high-risk scenarios (a minimal sketch of this division of labor follows below).

📌 One-time or Unique Projects: For unique projects where setting up automation would be too time-consuming or costly relative to the one-time use, manual processes are preferable (e.g., ad-hoc reports).

📌 Initial Setup and Customization: Manual effort is often necessary during the initial setup of analytics projects to customize models, frameworks, and strategies according to specific project requirements.
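
To make the fraud-detection handoff concrete, here is a minimal Python sketch of that division of labor; the score bands are purely illustrative:

```python
def route_transaction(fraud_score: float) -> str:
    """Route a transaction based on an automated fraud score in [0, 1].

    The thresholds below are illustrative; in practice they are tuned
    to the cost of false alarms versus the cost of missed fraud.
    """
    if fraud_score >= 0.90:
        return "block"           # clear-cut: the system decides alone
    if fraud_score <= 0.20:
        return "approve"         # clear-cut: the system decides alone
    return "manual_review"       # ambiguous: escalate to a human analyst

# Only the middle band consumes analyst time
for score in (0.05, 0.55, 0.95):
    print(score, "->", route_transaction(score))
```

The automated score disposes of the unambiguous cases, and analyst attention is reserved for the band where human judgment actually adds value.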

As automation tools advance, the scope for automation expands, offering greater efficiency and deeper insights. However, the human element remains indispensable for tasks requiring judgment, creativity, and strategic thinking. The integration of GenAI tools will further enhance data analytics, supporting more sophisticated data interpretation and decision-making while letting human analysts focus on high-value activities. This synergy will help organizations stay competitive in an increasingly data-driven world.

Prof. Frenzel
