The Silent Struggle: Why Most Side Hustles Fail Before They Even Begin (And How to Survive the Grind)


The Hard Truth No One Tells You

You finally gathered the courage to start that side hustle. You shared your dream with colleagues over lunch. Their response?

“Oh, nice hobby. But be careful – my cousin tried that and failed.”
“Stick to your job – at least it’s stable.”
“Who’s going to buy from YOU?”

Suddenly, your excitement feels foolish. The first month passes with just ₹3,000 in earnings. Your spouse asks, “Is this really worth your time?”

This is where 95% of dreams die – not from lack of potential, but from the unbearable weight of early struggles.


The 5 Enemies You’ll Face (And How to Beat Them)

1. The Confidence Killer

  • Reality: Your first 10 blog posts will get 3 views (all from your mom)
  • Solution:
    “I’m not failing – I’m collecting data on what doesn’t work.”
    → Track tiny wins (e.g., “Today I learned how to run a Facebook ad”)

2. The Toxic Chor Committee

  • Office gossip: “Look at Sharma ji’s son – wasting time on YouTube instead of MBA.”
  • Power Move:
    → Stop sharing plans with negative people
    → Create a “support squad” of 2-3 fellow hustlers

3. The Time Trap

  • Between job, kids, and chores, you’re exhausted by 10 PM
  • Hack:
    → The 90-Minute Rule (5 AM or 10 PM – claim one undisturbed slot)
    → Delegate/outsource household tasks (even ₹500/week for laundry buys you 3 extra hours)

4. The Comparison Curse

  • Seeing others’ “overnight success” while you struggle
  • Truth Bomb:
    → That “instant” influencer actually grinded for 2 years before going viral
    → Your Day 30 vs. Their Day 300 – not a fair fight

5. The Money Mirage

  • Expecting ₹50K/month in Month 2
  • Mindshift:
    → Treat first 6 months as “paid education” (what you learn is worth more than earnings)
    → Set process goals (“I’ll contact 10 clients/week”) not outcome goals

The Darkest Before Dawn: 3 Real Stories

  1. Ramesh (42), Bank Clerk → Catering Business Owner
  • First 8 months: Only 3 orders (from relatives)
  • Month 14: Landed a corporate contract
  • Now: Runs a ₹8L/month operation with his son
    “Almost quit after wasting ₹35,000 on failed recipes. Thank God I didn’t.”
  2. Priya (29), Teacher → Kids’ Book Author
  • Initial sales: 17 copies (mostly friends)
  • Persisted with school workshops
  • Year 3: ₹2L/month from book royalties + printables
  3. Amit (36), IT Employee → Stock Educator
  • First 50 YouTube videos: <100 views
  • Kept improving thumbnails/titles
  • Now: 2.7M subscribers, left job at 41

Your Survival Blueprint

Phase 1 (Months 1-6): The Silent Grind

  • ✔️ Expect nothing → Celebrate consistency
  • ✔️ Document the journey (future motivational fuel)
  • ✔️ Find 1 mentor (online counts)

Phase 2 (Months 7-18): The Tipping Point

  • ✔️ Double down on what works
  • ✔️ Automate/outsource repetitive tasks
  • ✔️ Reinvest first profits wisely

Phase 3 (Year 2+): The Breakthrough

  • ✔️ “Aha!” moment when systems click
  • ✔️ Side income = 50%+ salary → Escape options open


Starting a new venture or side hustle is an exciting journey – but it’s also one filled with unseen challenges that can shake your confidence and test your patience. If you’ve ever felt discouraged by slow progress, skeptical peers, or early setbacks, you’re not alone. These are common experiences that every entrepreneur or hustler faces, and understanding them is key to pushing through.

The Hidden Battle: Low Self-Esteem and Negative Criticism

Low self-esteem is among the first obstacles. When you present your dream to colleagues, friends, or family members, you could be met with doubts or negative criticism. These kinds of conversations tend to seed doubts. For instance, most startup entrepreneurs are initially faced with such doubts.

It’s all part of the process. Self-confidence isn’t achieved immediately; it builds over time as you rack up small wins. Experts suggest starting small, in areas where you already have skills and experience, to build a foundation of success.

The Pressure of Balancing Responsibilities

As you build your side business, you juggle your day job, family, and personal life. The people around you may not even notice the extra hours that you put in. This invisibility makes the work harder to justify and leaves you more inclined to quit. The trick is to remind yourself that your dream entails sacrifices that others may not value.

How to Carry On When You Feel Like Giving Up

  • Work in Silence, Let Results Speak: Do not broadcast every difficulty. Concentrate on persistent effort rather than instant reward.
  • Set Realistic Expectations: Every business takes time to mature. Do not expect overnight success.
  • Build Confidence Gradually: Begin with small, achievable goals in your areas of strength. Mark milestones to boost morale[4].
  • Seek Mentorship and Encouragement: Reach out to successful entrepreneurs or online communities, and keep asking even if your first few attempts are rejected[3].
  • Plan in Advance: Prepare a business plan covering your targets, marketing, finances, and schedules. Planning reduces uncertainty and prepares you for problems[6].

Why You Have to Keep Trying

If you choose to give up when things get difficult, your dreams will be nothing more than dreams. To live life beyond the ordinary, you must be willing to do what others will not: persevere through the quiet struggle. Remember, every successful entrepreneur has experienced failure and disappointment. What sets them apart is their determination to keep moving forward in spite of these failures.

The Life-Changing Perspective Shift

Every late night, every ignored insult, every failed attempt is:
→ Depositing into your future freedom account
→ Writing the “how I made it” story your kids will tell
→ Building the unshakable confidence that comes only from overcoming doubt

This struggle isn’t your obstacle – it’s your unfair advantage. Those who quit never develop the resilience your journey is forcing you to build.


Final Rally Cry:
“When you’re tempted to quit, remember: The version of you that survives this grind will laugh at what once seemed impossible. Keep going – your future self is counting on you.”

P.S. Feeling stuck right now? Do this:

  1. Grab paper → Write today’s date + “I WILL NOT QUIT”
  2. List 3 micro-actions for this week (e.g., “Post 1 Reel,” “Email 5 clients”)
  3. Put it where you’ll see it daily (bathroom mirror/wallet)

Using machine learning to predict stock prices

Machine learning has become a powerful tool for predicting stock prices, as it allows for the analysis of large amounts of data and can identify patterns that humans may not be able to discern. In this blog post, we’ll explore how machine learning is used to predict stock prices and some of the challenges that come with this approach.

Time series forecasting

One of the most popular methods for predicting stock prices using machine learning is called “time series forecasting.” This approach involves using historical data on stock prices, such as daily closing prices, to train a model that can then be used to make predictions about future stock prices. The model looks at patterns in the historical data, such as trends and seasonality, to make predictions about future prices.
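
As a minimal sketch of the idea, the snippet below forecasts the next closing price as the average of the most recent closes. This is a deliberately naive baseline, not a trained model; the function name `moving_average_forecast`, the `window` parameter, and the prices are all invented for illustration.

```python
from statistics import mean


def moving_average_forecast(closes, window=3):
    """Forecast the next value as the mean of the last `window` closes."""
    if len(closes) < window:
        raise ValueError("need at least `window` observations")
    return mean(closes[-window:])


# Hypothetical daily closing prices, invented for illustration only.
closes = [100.0, 102.0, 101.0, 103.0, 105.0]

# Uses the last three closes: 101.0, 103.0 and 105.0.
forecast = moving_average_forecast(closes, window=3)
print(forecast)
```

A baseline like this is mainly useful as a yardstick: a more complex model that cannot beat it on held-out data is probably not learning anything useful.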

Sentiment analysis

Another popular method is called “sentiment analysis,” which uses natural language processing (NLP) to analyze news articles, social media posts, and other text data to determine the overall sentiment or tone of the market. The idea is that if the sentiment is positive, the market will likely go up, and if the sentiment is negative, the market will likely go down.
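
To make this concrete, here is a toy lexicon-based scorer. It is only a sketch: the word lists `POSITIVE` and `NEGATIVE`, the function `sentiment_score`, and the headlines are hypothetical, and production systems typically rely on much larger lexicons or trained NLP models.

```python
import re

# Hypothetical, tiny word lists chosen for this example only.
POSITIVE = {"beat", "growth", "profit", "surge", "upgrade"}
NEGATIVE = {"miss", "loss", "lawsuit", "plunge", "downgrade"}


def sentiment_score(text):
    """Return (positive - negative) / matched words, a value in [-1, 1].

    Returns 0.0 when no word from either list appears in the text.
    """
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Invented headlines, not real news.
headlines = [
    "Quarterly profit and revenue growth beat expectations",
    "Shares plunge after lawsuit and analyst downgrade",
]
scores = [sentiment_score(h) for h in headlines]
print(scores)
```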

Dynamic nature and complexity of the stock market

One of the challenges with using machine learning to predict stock prices is that the stock market is highly dynamic and constantly changing. This means that models need to be regularly retrained and updated to take into account new data and changing market conditions. Additionally, accurate data is hard to obtain, and model performance depends heavily on feature engineering.

Another challenge is the complexity of the stock market itself, with many factors impacting stock prices such as company performance, economic indicators, and global events. This means that a machine learning model may not be able to take all of these factors into account and may produce inaccurate predictions as a result.

Machine learning algorithms to predict stock prices

There is no single machine learning algorithm that is guaranteed to provide the most accurate predictions of stock prices. The best algorithm depends on the specific characteristics of the data, such as the time period being analyzed and the presence of any specific trends. That being said, some of the more commonly used machine learning algorithms for stock price prediction include:

  1. Artificial Neural Networks (ANNs) – ANNs are used to model complex relationships between inputs and outputs, making them well-suited for stock price prediction.
  2. Support Vector Machines (SVMs) – SVMs are used for classification and regression tasks, and have been applied to stock price prediction to identify trends and make predictions based on historical data.
  3. Decision Trees and Random Forests – Decision trees and random forests are used for classification and regression tasks, and can be applied to stock price prediction by analyzing the relationships between stock prices and a variety of factors, such as economic indicators, company-specific news and events, and global events.
  4. Time series analysis (ARIMA, SARIMA, etc.) – Time series analysis methods are used to model time-dependent data, and are often applied to stock price prediction by analyzing trends and patterns in historical stock data.

Regardless of the algorithm used, it is important to have a solid understanding of the stock market and to thoroughly validate and test the model before using it to make any investment decisions.

Irreplaceable human judgement and knowledge

Despite these challenges, machine learning has the potential to revolutionize the way we predict stock prices. With the increasing availability of data and advances in machine learning techniques, it’s likely that we’ll see more and more accurate predictions in the future. However, it is important to note that stock prices are highly unpredictable and machine learning should be used as one of the tools in the decision making process.

Stock trading with AI algorithms

Algo trading is very popular nowadays amongst systematic traders. They just hand over the decision-making process to a few pieces of code and sit back. The backtested code runs on some logic set by the trader with a certain probability of profitability.

So why not take advantage of machine learning to develop a trading system with a higher winning percentage?

Trading with machine learning typically involves using algorithms to analyze large amounts of historical market data, identify patterns and trends, and make predictions about future price movements. These predictions can then be used to inform trading decisions, such as when to buy or sell a particular security.

However, it’s important to note that even the most sophisticated machine learning algorithms cannot guarantee profits, and trading on their predictions carries significant risk. A well-designed machine learning model should be validated and tested thoroughly on historical data before being used to make investment decisions. It’s also important to be aware of the limitations of machine learning algorithms and to use them in conjunction with other forms of analysis, such as fundamental analysis and technical analysis.
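
One common way to test a model on historical data is walk-forward validation, where the model is always evaluated on data that comes after the data it was trained on. The sketch below only generates the index splits; the helper name `walk_forward_splits` and its parameters are assumptions made for this example.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training window starts at index 0 and grows by `test_size` each
    step; the test window always lies strictly after the training window.
    """
    start = initial_train
    while start + test_size <= n_samples:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# Ten hypothetical trading days: train on the first 4, test 2 days at a time.
splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train, test in splits:
    print(train, test)
```

Keeping the test window after the training window avoids look-ahead bias, where a model is accidentally scored on information it could not have had at the time.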

Final words

Machine learning has become a powerful tool for predicting stock prices, with time series forecasting and sentiment analysis being two of the most popular methods. While there are challenges that come with using machine learning in this context, such as the dynamic nature of the stock market and the complexity of the factors that impact stock prices, advances in machine learning techniques have the potential to lead to more accurate predictions in the future. As always, it’s important to use a variety of tools and approaches to make investment decisions, and not to rely solely on machine learning predictions.

Career in data science

A career in data science has a lot to offer, from hands-on learning to the chance to contribute to real-world projects.

Data science is a highly sought-after and rewarding field that is projected to keep growing. As businesses and organizations increasingly rely on data to make decisions, the need for skilled data scientists to extract insights from that data is becoming more pressing.

Career opportunities in data science

There are many career opportunities in data science, including roles such as:

Data Analyst: responsible for collecting, analyzing, and interpreting large sets of data.

Data Engineer: responsible for designing, building, and maintaining the infrastructure and systems that support data science efforts.

Machine Learning Engineer: responsible for designing and developing models that can learn from data, and deploying those models to production.

Business Intelligence Analyst: responsible for creating and maintaining reporting and analysis systems that provide insights to support business decision-making.

Data Scientist: responsible for using statistical and machine learning techniques to extract insights from data and communicate those insights to stakeholders.

Research Scientist: responsible for developing new techniques and algorithms in the field of machine learning and data science, often in an academic or research setting.

Data science: an amalgamation of different skills

A career in data science typically involves using a combination of computer science, statistics, and domain expertise to analyze large sets of data and extract insights that can be used to inform business decisions. This can include tasks such as building predictive models, identifying patterns and trends in data, and developing algorithms to automate decision-making.

Data science is an interdisciplinary field, and data scientists are in high demand across many industries, including technology, finance, healthcare, and retail.

The role played by data science professionals

A data scientist is a professional who is responsible for using statistical and machine-learning techniques to extract insights from data and communicate those insights to stakeholders. They play a key role in the data-driven decision-making process in an organization.

Specific responsibilities

The specific responsibilities of a data scientist can vary depending on the industry and the company, but some common tasks include:

  • Collecting and cleaning large sets of data from various sources.
  • Exploring and analyzing the data using statistical and machine learning techniques.
  • Building and implementing models that can learn from data.
  • Communicating insights and findings to stakeholders through visualizations, reports, and presentations.
  • Deploying models to production systems.
  • Continuously monitoring the performance of models and updating them as needed.
  • Collaborating with cross-functional teams to identify new opportunities for data-driven decision-making.

Data scientists often work with large and complex data sets, and they need to be proficient in a variety of tools and technologies, such as programming languages like Python and R, data visualization tools like Tableau and Power BI, and machine learning libraries like scikit-learn and TensorFlow.

With the growth in data and the increasing importance of data-driven decision making, the role of the data scientist is becoming more important in organizations.

Prerequisites to study data science

To pursue a career in data science, it is typically recommended to have a strong background in mathematics and computer science, as well as experience with programming languages such as Python or R. Many data scientists also have advanced degrees in fields such as statistics, computer science, or electrical engineering.

In addition to strong technical skills, data scientists should also have excellent problem-solving and communication skills. The ability to translate complex technical concepts into plain language and to work with cross-functional teams is essential to be successful in this field.

There are various roles and career paths within the field of data science. Some data scientists specialize in a particular area, such as machine learning or natural language processing, while others work on a wide range of projects. Popular roles include data analyst, data engineer, data architect, machine learning engineer, and data scientist.

The demand for data scientists continues to rise as organizations of all sizes and industries look to leverage data to drive growth and improve decision-making. According to a report from Glassdoor, data scientist is among the top jobs in the United States, and demand is expected to keep growing in the coming years.

Salary packages for data science professionals

Salary packages for data science roles vary depending on factors such as location, industry, experience level, and specific job responsibilities. However, in general, data science roles tend to be well-paying.

According to data from Glassdoor, the average salary for a data scientist in the United States is around $120,000 per year, with some positions paying as much as $160,000 or more. Data engineers and machine learning engineers tend to earn slightly less, with an average salary of around $105,000 per year. Business intelligence analysts and data analysts typically earn less again, with an average salary of around $70,000 – $90,000 per year.

It’s important to note that the salary packages also vary based on the location, with the highest paying locations being San Francisco, Seattle and New York City. Also, the level of experience, skill set and certifications would also have an impact on the salary package.

Salary of a data scientist in India

The salary of a data scientist in India can vary depending on factors such as location, industry, experience level, and specific job responsibilities. However, on average, data science roles in India tend to be well-paying.

According to data from Glassdoor, the average salary for a data scientist in India is around INR 12,00,000 per year (or roughly USD 16,500), with some positions paying as much as INR 20,00,000 (or roughly USD 28,000) or more. Data engineers and machine learning engineers tend to earn slightly less, with an average salary of around INR 8,00,000 (or roughly USD 11,000) per year. Business intelligence analysts and data analysts typically earn less again, with an average salary of around INR 6,00,000 (or roughly USD 8,500) per year.

It’s important to note that salary packages also vary by location, with the highest-paying locations being metropolitan cities such as Mumbai, Delhi, and Bengaluru. Experience, skill set, and certifications also affect the salary package.

Conclusion

In conclusion, data science is a challenging and rewarding field that is growing in demand. With strong technical skills, problem-solving abilities, and communication skills, data scientists can find a wide range of opportunities in various industries. With the right education, skills, and experience, you can be well on your way to a successful career in data science.

Machine learning for beginners

Machine learning is a rapidly growing field that is changing the way we interact with technology. It is a method of teaching computers to learn from data, without explicitly programming them. This allows computers to identify patterns and make predictions, making it a powerful tool for solving complex problems.

If you’re new to machine learning, it can be overwhelming to know where to start. In this article, we will provide a beginner’s guide to machine learning, covering the basics and providing an overview of the most common techniques.

Supervised and unsupervised learning

First, it’s important to understand the difference between supervised and unsupervised learning. Supervised learning is when the computer is provided with labeled data, which means that the correct output is already known. This type of learning is used for tasks such as image classification, where the computer is shown an image and must identify what is in the image.

Unsupervised learning, on the other hand, is when the computer is not provided with labeled data. Instead, it must identify patterns and relationships within the data on its own. This type of learning is used for tasks such as clustering, where the computer groups similar data together.

Overfitting of models

Another important concept in machine learning is overfitting. This occurs when a model is too complex and performs well on the training data but poorly on new, unseen data. To prevent overfitting, it’s important to use techniques such as cross-validation and regularization.
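
To illustrate cross-validation, the sketch below splits sample indices into k folds so that each fold is used once for validation while the rest are used for training. The helper `k_fold_indices` is written for this example and is not a library function; in practice the data is usually shuffled first (except for time-ordered data), which this sketch omits.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Yields (train_indices, validation_indices); every sample appears in
    exactly one validation fold. Earlier folds get the extra samples when
    n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=7, k=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

A model that scores well on the training indices but poorly on the validation indices across folds is showing the overfitting pattern described above.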

Important machine learning algorithms

There are several popular machine learning algorithms that are commonly used, including:

Linear regression: used for predicting a continuous outcome

Linear regression is a supervised machine learning algorithm used for predicting a continuous outcome. The goal of linear regression is to find the best linear relationship between the input variables (also known as independent variables or predictors) and the output variable (also known as the dependent variable or target). It does this by finding the line of best fit, represented by the equation:

y = b0 + b1*x1 + b2*x2 + … + bn*xn

Where y is the predicted value, x1, x2, …, xn are the input variables, and b0, b1, b2, …, bn are the coefficients that need to be learned. These coefficients are learned by minimizing the difference between the predicted values and the true values.
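
For the special case of a single input variable (y = b0 + b1*x1), the coefficients that minimize the sum of squared differences have a simple closed form. The sketch below implements it with the standard library; the function name `fit_simple_linear_regression` and the data are made up for illustration, and libraries such as scikit-learn handle the general multi-variable case.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors for y = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Invented data generated from y = 1 + 2*x, so the fit should recover it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(xs, ys)
print(b0, b1)
```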

Linear regression is a simple and interpretable algorithm that makes it easy to understand the relationship between the input variables and the output variable. However, it has some limitations, such as the assumption that the relationship between the variables is linear, which may not always be the case. In such situations, more complex algorithms such as polynomial regression or non-linear regression may be used.

Linear regression can be implemented in various programming languages such as Python, R, and Matlab. The most popular libraries for implementing Linear Regression are scikit-learn, statsmodels and tensorflow.

In summary, linear regression is a basic yet powerful algorithm for predicting a continuous outcome. It is easy to implement and interpret, and it is widely used in fields such as finance, economics, and engineering.

Logistic regression: used for predicting a binary outcome

Logistic regression is a supervised machine learning algorithm used for predicting a binary outcome. It is a variation of linear regression, where the goal is to model the probability of a certain class or event occurring. The logistic function (also called the sigmoid function) is used to map the input variables to a probability between 0 and 1. This function is represented by the equation:

p(x) = 1 / (1 + e^-(b0 + b1*x1 + b2*x2 + … + bn*xn))

Where x1, x2, …, xn are the input variables, b0, b1, b2, …, bn are the coefficients that need to be learned, and p(x) is the predicted probability of the event occurring.

The logistic regression algorithm uses the logistic function to estimate the probability of the event occurring and uses a threshold (usually 0.5) to classify the outcome as either 0 or 1.
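
The prediction step can be sketched in a few lines. The coefficients below are hypothetical values chosen by hand, not learned from data, and the function names `predict_probability` and `classify` are assumptions for this example.

```python
import math


def predict_probability(coeffs, x):
    """Apply the logistic function to b0 + b1*x1 + ... + bn*xn.

    coeffs = [b0, b1, ..., bn]; x = [x1, ..., xn].
    """
    z = coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def classify(coeffs, x, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_probability(coeffs, x) >= threshold else 0


coeffs = [-4.0, 1.0, 0.5]  # hypothetical b0, b1, b2

# Here b0 + b1*x1 + b2*x2 = -4 + 2 + 2 = 0, which sits on the decision boundary.
p = predict_probability(coeffs, [2.0, 4.0])
print(p, classify(coeffs, [2.0, 4.0]), classify(coeffs, [1.0, 1.0]))
```

Lowering or raising `threshold` trades false negatives against false positives, which matters when the two kinds of error have different costs.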

Logistic regression is a widely used algorithm for classification problems, and it is easy to implement and interpret. However, it assumes that the log-odds of the outcome are a linear function of the input variables, so it cannot capture non-linear relationships unless the features are transformed first. In such situations, more flexible algorithms such as decision trees or support vector machines may be used.

Logistic regression can be implemented in various programming languages such as Python, R, and MATLAB. The most popular libraries for implementing logistic regression are scikit-learn, statsmodels, and TensorFlow.

In summary: logistic regression is a powerful algorithm for predicting a binary outcome. It is easy to implement and interpret, and it is widely used in fields such as medicine, finance, and the social sciences.

Decision trees: used for both classification and regression tasks

A decision tree is a supervised machine learning model used for both classification and regression tasks. It is a tree-based model where each internal node represents a test on an attribute, each branch represents the outcome of a test, and each leaf node represents a class label (or, for regression, a predicted value).

The idea behind decision trees is to recursively partition the data into subsets based on the values of the input features. The algorithm starts at the root node and selects the feature that best splits the data into subsets with the most similar class labels. The process is repeated on each subset of the data until a stopping criterion is met. The final result is a tree of decisions that can be used to make predictions for new data.
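
The core of that process is choosing a split. The sketch below scores candidate thresholds on one numeric feature using Gini impurity, which is one common measure of how mixed the class labels in a subset are (entropy is another). The function names and the data are invented for this example, and a full tree would repeat this search recursively over all features.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure subset, higher when classes are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split on one feature.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, float("inf"))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in pairs if v <= t]
        right = [y for v, y in pairs if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (t, score)
    return best


# Invented feature values and labels that separate cleanly between 3 and 10.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(values, labels)
print(threshold, impurity)
```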

One of the main advantages of decision trees is their interpretability. They are easy to understand and visualize, and they can handle both categorical and numerical features. However, decision trees can be prone to overfitting, especially when the tree becomes too deep. This can be addressed by using techniques such as pruning, which removes branches that do not add much value to the tree.

Another popular variation of decision trees is random forests, which is an ensemble of decision trees. Random forests use multiple decision trees and combine their predictions to improve the overall performance of the model.

Decision Trees can be implemented in various programming languages such as Python, R, and Matlab. The most popular libraries for implementing Decision Trees are scikit-learn, R’s rpart package, and caret package.

In summary: decision trees are a powerful tool for both classification and regression tasks. They are easy to interpret and can handle both categorical and numerical features, but they can be prone to overfitting. A random forest is an ensemble of decision trees, which improves the overall performance of the model.

Random forests: an ensemble of decision trees

Random Forest is an ensemble machine learning algorithm used for both classification and regression tasks. It is a variation of decision trees, where multiple decision trees are trained and combined to make predictions. The idea behind random forests is to reduce the variance and increase the accuracy of the model by averaging the predictions of multiple decision trees.

A random forest algorithm generates multiple decision trees by training each one on a different random sample of the data points (typically a bootstrap sample drawn with replacement) and by considering only a random subset of the features when splitting. The final prediction is made by averaging the predictions of all the trees in the forest for regression, or by taking a majority vote for classification.
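
The two building blocks of this procedure, random resampling of the data and combining the trees' outputs, can be sketched without any tree code. The helpers below (`bootstrap_sample`, `majority_vote`, `average`) are illustrative names, sampling with replacement is one common choice rather than the only one, and the per-tree predictions are made-up stand-ins for real tree outputs.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, so some rows repeat and some are left out."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine class predictions from several trees (classification)."""
    return Counter(predictions).most_common(1)[0][0]


def average(predictions):
    """Combine numeric predictions from several trees (regression)."""
    return sum(predictions) / len(predictions)


rng = random.Random(0)  # fixed seed so the run is repeatable
rows = list(range(10))  # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)

# Hypothetical outputs of three trees for one new data point.
vote = majority_vote(["up", "down", "up"])
avg = average([1.0, 2.0, 3.0])
print(sample, vote, avg)
```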

One of the main advantages of random forests is that they are less prone to overfitting than single decision trees. This is because each tree in the forest is trained on a different subset of the data, which reduces the correlation between the trees. Additionally, random forests can handle both categorical and numerical features and are able to capture non-linear interactions between the features.

Random Forest can be implemented in various programming languages such as Python, R, and Matlab. The most popular libraries for implementing Random Forest are scikit-learn, R’s randomForest package, and caret package.

In summary: Random Forest is a powerful ensemble machine learning algorithm used for both classification and regression tasks, it’s less prone to overfitting than single decision trees, it can handle both categorical and numerical features and is able to capture non-linear interactions between the features. It combines the predictions of multiple decision trees to improve the overall performance of the model.

k-nearest neighbors: used for classification and regression tasks

k-nearest neighbors (k-NN) is a supervised machine learning algorithm used for both classification and regression tasks. It is a non-parametric method, which means that it does not make any assumptions about the underlying distribution of the data.

The idea behind k-NN is to classify a new point based on its similarity to other points in the data. The algorithm works by finding the k-nearest data points to the new point, and then the majority class or the average value of the k-nearest points is used to make the prediction.
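
A minimal classification version of this procedure, assuming Euclidean distance and numeric features, might look like the following. The function `knn_predict` and the sample points are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two invented clusters: class "A" near (1.5, 1.5), class "B" near (8.5, 8.5).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["A", "A", "A", "B", "B", "B"]

pred_a = knn_predict(points, labels, (2.0, 2.0), k=3)
pred_b = knn_predict(points, labels, (8.0, 8.5), k=3)
print(pred_a, pred_b)
```

For regression, the majority vote would be replaced by the average of the neighbors' target values.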

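One of the main advantages of k-NN is its simplicity and interpretability. It has no explicit training phase, since the model simply stores the training data, and with a suitable distance measure it can handle both categorical and numerical features. However, it can be sensitive to the choice of k and to the scale and distribution of the data, and prediction can be slow on large datasets because every query is compared against the stored points. To reduce the sensitivity to scale, feature scaling techniques such as normalization are often used.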

The k-NN algorithm can be implemented in various programming languages such as Python, R, and Matlab. The most popular libraries for implementing k-NN are scikit-learn, R’s class and caret package.

In summary: k-nearest neighbors (k-NN) is a simple and interpretable algorithm used for both classification and regression tasks. It predicts the value for a new point based on its similarity to other points in the data. It needs no explicit training phase and, with a suitable distance measure, can handle both categorical and numerical features. However, it can be sensitive to the choice of k and to the scale and distribution of the data.

Support vector machines: used for classification tasks

Support vector machines (SVMs) are supervised machine learning algorithms used mainly for classification tasks, although a regression variant also exists. They are powerful and versatile and can handle both linear and non-linear data.

The goal of an SVM is to find the best boundary (also called a hyperplane) that separates the data points into different classes. The boundary that maximizes the margin, which is the distance between the boundary and the closest data points from each class, is chosen as the best boundary. These closest data points from each class are called support vectors.

SVMs can handle both linear and non-linear data by using a technique called the kernel trick. A kernel function measures the similarity between two data points as if they had been mapped into a higher-dimensional space where the classes become linearly separable, without ever computing that mapping explicitly. The linear boundary found in that space corresponds to a non-linear boundary in the original space.
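
A small worked example can make the kernel trick less abstract. For 2-D inputs, the degree-2 polynomial kernel k(x, z) = (x . z)^2 gives the same number as an ordinary dot product taken after mapping each point into a 3-D feature space. The sketch below checks this numerically; the kernel choice, the function names, and the points are all chosen for illustration.

```python
import math


def poly_kernel(x, z):
    """Degree-2 polynomial kernel for 2-D inputs: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit 3-D feature map whose dot product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


x, z = (1.0, 2.0), (3.0, 4.0)

# Computed in the original 2-D space: (1*3 + 2*4)^2.
k_value = poly_kernel(x, z)

# Computed after explicitly mapping both points into 3-D.
explicit = dot(phi(x), phi(z))
print(k_value, explicit)
```

The two values agree up to floating-point rounding, which is why an SVM can work with the kernel alone and skip the explicit mapping.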

One of the main advantages of SVMs is that they can handle high-dimensional data and often achieve high accuracy. However, they can be sensitive to the choice of the kernel and the parameters of the model. Additionally, training an SVM can be slow on large datasets.

SVMs can be implemented in various programming languages such as Python, R, and MATLAB. The most popular tools for implementing SVMs are scikit-learn, R’s e1071 package, and MATLAB’s fitcsvm.

In summary: support vector machines (SVMs) are powerful and versatile algorithms used mainly for classification tasks. An SVM finds the boundary that separates the data points into different classes with the largest margin, and it can handle both linear and non-linear data using the kernel trick. SVMs often achieve high accuracy, but they can be sensitive to the choice of the kernel and the parameters of the model, and training can be slow on large datasets.

Neural networks: used for a wide range of tasks

Neural Networks (NNs) are a type of machine learning algorithm inspired by the structure and function of the human brain. They are a set of algorithms that are designed to recognize patterns in data, by learning from examples.

A neural network is made up of layers of interconnected nodes, also known as artificial neurons. Each neuron receives inputs, performs a computation on them, and then produces an output. The layers of neurons are connected to each other, and the output of one layer becomes the input for the next layer. The last layer produces the final output of the network.
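
A small forward pass shows how the output of one layer becomes the input of the next. This is a sketch only: the weights, biases and the sigmoid activation are arbitrary choices made for illustration, not values from a trained network.

```python
from math import exp

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def dense_layer(inputs, weights, biases):
    """Each neuron: weighted sum of its inputs plus a bias, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

# Hypothetical parameters: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_w, hidden_b = [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]
output_w, output_b = [[1.0, -1.0]], [0.0]

x = [1.0, 1.0]
hidden = dense_layer(x, hidden_w, hidden_b)        # output of the hidden layer
output = dense_layer(hidden, output_w, output_b)   # hidden output is the next input
print(hidden, output)
```

Training a network means adjusting these weights and biases so that the final output matches the examples; the forward pass itself stays the same.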

A very common type of neural network is the feedforward neural network, of which the multi-layer perceptron (MLP) is the classic example. In this type of network, the information flows only in one direction, from the input layer to the output layer.

There are other types of neural networks, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), which are suited for specific tasks such as natural language processing and image recognition.

One of the main advantages of neural networks is their ability to learn complex, non-linear relationships in the data. They can be trained to perform a wide range of tasks, from simple linear regression to complex image recognition. However, neural networks can be difficult to train and can require a large amount of data and computational resources. Additionally, the process of understanding and interpreting the internal workings of a neural network can be challenging.

Neural networks can be implemented in various programming languages such as Python, R, and MATLAB. The most popular libraries for implementing neural networks are TensorFlow, Keras, and PyTorch.

In summary: Neural Networks (NNs) are a type of machine learning algorithm inspired by the structure and function of the human brain. They are designed to recognize patterns in data, by learning from examples. They can be trained to perform a wide range of tasks, from simple linear regression to complex image recognition. They can be difficult to train and can require a large amount of data and computational resources, but they can learn complex, non-linear relationships in the data.

Conclusion

Finally, it’s important to understand the role of feature engineering in machine learning. This is the process of transforming raw data into useful inputs for a model. Feature engineering can greatly improve the performance of a model, so it’s an important step in any machine learning project.

Machine learning is a vast field with a lot to learn, but by understanding the basics and familiarizing yourself with the most common techniques, you can start to build your own models and begin solving problems with machine learning.

How to execute R script in Power BI? a comprehensive guide

Execute R script in Power BI

In this article, I am going to discuss how we can use the analytical and visualization power of the R programming language within Power BI. We can execute R scripts in Power BI to create data models, prepare reports, clean data, perform advanced data shaping and analytics, handle missing data, and do clustering, forecasting and many other advanced tasks.


R is arguably the most preferred language by Data Scientists. It is an open-source language backed by a vast community of developers and users. It has very rich libraries to perform almost all kinds of complex analysis.

Microsoft Power BI provides a nifty feature for integrating the power of the R language. We can run R scripts within Power BI and thus perform even more complex analysis.

Installing R

To use R scripts in Power BI, you need R installed on the same computer on which you are using Power BI. You can install R from the CRAN distribution of the R project or go for the Microsoft R Open distribution (MRAN).

It will be good if you install a suitable R IDE too, like RStudio. The free version of RStudio can serve our purpose. The IDE helps us to check the R code for errors, because correcting errors in R code within Power BI can be difficult.

If you have already installed R and RStudio, you may have more than one version of R on your computer. Check that the correct R installation is selected in File -> Options and settings -> R scripting.

Checking the R version installed in your computer

Importing data using R script

We can import data using an R script too. The “Get data” option in Power BI helps us to import data. Here is an example of importing data I have saved in CSV format on my computer.

See the image below, which consists of screenshots of all the steps from my computer. The “Other” option under “Get data” has the option to enter your R script to import the data.

See the 4th step in the below image where a window gets opened and I put the following R script:

dataset <- read.csv(file="E:/test.csv", header=TRUE, sep=",")

The code is very simple: it imports a .csv file from my computer’s E:/ drive, keeping its original header and using “,” as the separator.

Steps to import data using R script

As the given CSV file has only one dataset, the next window showed the dataset with the variable name “dataset”, as I named it in the script.

The R script should return at least one data frame. Power BI creates a table from each of the data frames. If a data frame has columns containing complex or vector values, Power BI will display an error for them. Any field with “N/A” will display as “NULL”.

If the R script takes more than 30 minutes to execute then it will time out. If there is any interactive field in the R script, like user input, then it may halt the script’s execution. In case your R script contains any file location, define the full path instead of providing a relative path.

Execute R script in Power BI

Here I will demonstrate executing a simple R script. My purpose is just to show the process of running an R script inside Power BI. So, the script will be very simple here.

The data consists of two variables, X and Y. Y is dependent on the X variable. I will use the “neuralnet” library of R to create a prediction of Y using X.

I have written the code in an external R IDE, i.e. RStudio, and executed it there before using it in Power BI to confirm that it is free from errors.

In the Visualizations pane of the Power BI, you can see the icon for executing the R script. We need to click this icon and provide the variables we are going to use in the R script.

Writing R script

Until the R script gets executed, the report view will show a blank R script visualization window. Copy the code from the R IDE, i.e. RStudio, paste it into the R script editor and run the code.

Error handling

If we have any error in our R code, Power BI will report it. For example, here I intentionally gave a wrong dataset name, and Power BI has clearly mentioned that error in its report view. See the image below.

Error handled by the Power BI

The error says that the object ‘a’ is not found. This is because the script refers to ‘a’, whereas the dataset variable is actually named ‘dataset’.

Once I have corrected the mistake, the script gets executed and here is the output. The neuralnet is applied and the corresponding neuralnet architecture is created.

Executing the R script within Power BI

So here is a quick overview of how we can use the power of the R programming language within Power BI Desktop. R is arguably the most preferred data science language and is backed by a vast community of data scientists and analysts.

I tried to provide a comprehensive guide on how to execute R scripts in Power BI, with all the relevant screenshots taken while doing it myself on my computer. I hope that you find it helpful when doing the same for the first time.

So, try it on your own following the steps described here. Please comment below if you have any queries or suggestions.

What are logical functions in Power BI, and how to use them?

Logical functions in Power BI

The logical functions in Power BI are essential while writing DAX expressions. Logical functions help us in decision making to check if any condition is true or false.

Once the data has been extracted through Power Query, these DAX expressions help us to fetch important information from the data. Here is an article explaining the difference between Power Query and DAX, which you may be interested in.

The logical functions in Power BI I will discuss here are IF, AND, OR, NOT, IN and IFERROR. They are all true to their names and do the task exactly as they are used in English.

I will discuss them along with their application on a data set containing the area and production of different crops of different Indian states. Below is a glimpse of the dataset.

Dataset with crop production

I have collected the data from the web with the data scraping feature of Power BI. Here is the article where I have explained how you can take advantage of this nifty feature of loading data from the web in Power BI.

“IF” logical function

The IF function accepts three arguments. Its syntax is shown below. We can see that it reads like an English conditional sentence and is very easy to understand.

IF (expression, True_Info, False_Info)

The first argument of this function is a Boolean expression. If this expression evaluates to TRUE, the IF function returns the second argument; otherwise it returns the third argument.

Let’s see a practical example of its use on the India_statewise_crop_production dataset. I have created a new column Production_category using the IF function. If the production is less than 10, then it is under LOW production_category; otherwise HIGH production_category.

Creating new column using IF function

“Nested IF” function

We can use IF within another IF function, which is called the nested IF function. It helps us to check more than one condition at a time.

For example, I have placed two conditions here. One is the earlier one I used in the IF function and added another that if Production is greater than 500 then the production is HIGH else MEDIUM.

See the result below, how the Production_category column has the new values according to the NESTED IF condition.

Nested IF function
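
The branching of the nested IF described above can be sketched in Python. This is an analogue of the logic, not DAX; the function name and the sample values are hypothetical.

```python
def production_category(production):
    """Mirror of the nested IF: LOW below 10, HIGH above 500, else MEDIUM."""
    if production < 10:
        return "LOW"
    # The inner IF is only reached when the outer condition is false.
    return "HIGH" if production > 500 else "MEDIUM"

# Hypothetical production values, not taken from the dataset.
for value in (4, 120, 900):
    print(value, production_category(value))
```

The order of the checks matters: the outer condition is tested first, and the inner condition only decides between the two remaining categories.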

“AND” logical function

It takes two arguments. If both the arguments are TRUE, it returns TRUE, else FALSE. Its syntax is as below:

AND (Logical_condition1, Logical_condition2)

I have applied the AND function to find out if the productivity is high or low. I have used AND to check two conditions: Area is less than 10 and Production is higher than 200. If both the conditions are TRUE then it returns “High Productivity”, else “Low Productivity”.

Use of AND function

“OR” logical function

Unlike the AND logical function, the OR function returns TRUE if any one of the conditions holds true. It returns FALSE only if both of the conditions are FALSE.

For the crop production data set, I have applied the OR function to check the conditions Area<10 and Production<20. If either of them is true, it should return “Low production”, else “High Production”.

Use of OR function
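
To contrast the two functions, here is a Python analogue of the AND and OR checks described above. It is not DAX, and the function names and the sample rows are hypothetical.

```python
def productivity(area, production):
    """AND: both conditions must be true."""
    both = area < 10 and production > 200
    return "High Productivity" if both else "Low Productivity"

def production_flag(area, production):
    """OR: one true condition is enough."""
    either = area < 10 or production < 20
    return "Low production" if either else "High Production"

# Hypothetical (area, production) pairs, not taken from the dataset.
for area, production in [(5, 300), (5, 100), (50, 15), (50, 300)]:
    print(area, production, productivity(area, production),
          production_flag(area, production))
```

The row (5, 100) shows the difference: it fails the AND test because only one condition holds, but the same single true condition is enough for the OR test.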

“NOT” function

The NOT logical function simply changes FALSE to TRUE and TRUE to FALSE. It is very simple to use. See the below example.

I have used NOT with the IF function. The IF checks the condition Season = “Kharif”. If it is true, IF returns True, and the NOT function then turns it into False. See the output column “Kharif_check”: it has False corresponding to Kharif, and for other entries it has True.

Use of NOT function

“IN” logical function

The IN function lets us check the specific entries under a column and calculate corresponding values for other columns.

In this example, I wanted to calculate the total production for only three states “Assam”, “Bihar” and “Uttar Pradesh”. In order to do that, I have created one measure using the SUM and IN function nested under the CALCULATE function. And see the result on a card.

Use of IN Function
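
What the measure computes can be sketched in Python: sum the production only for rows whose state is in a given list. This is an analogue of the logic, not DAX, and the rows and column names below are hypothetical.

```python
# Hypothetical rows standing in for the crop production table.
rows = [
    {"State": "Assam", "Production": 120.0},
    {"State": "Bihar", "Production": 80.0},
    {"State": "Kerala", "Production": 200.0},
    {"State": "Uttar Pradesh", "Production": 300.0},
]

selected_states = {"Assam", "Bihar", "Uttar Pradesh"}

# Keep only the rows whose State is IN the selected set, then SUM Production.
total_selected = sum(r["Production"] for r in rows if r["State"] in selected_states)
total_all = sum(r["Production"] for r in rows)

print(total_selected, total_all)
```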

“IFERROR” logical function

IFERROR is another very useful logical function that checks for any error and returns values accordingly. This function is very useful while checking arithmetic overflow or any other kind of errors.

The syntax for this function is as below:

IFERROR (Value, ValueIfError )

You can get the syntax guide when you select the function in the Power BI editor, as in the image below. As soon as I started to type the function name, Power BI IntelliSense guided me with the autocomplete and the syntax for the function.

Use of IFERROR function

In my example, I have checked if there is an error in the Crop column. If any error is found, it should return “Error”. As there was no such error in the column, the IFERROR column has exactly the same values as the Crop column.
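
The pattern behind IFERROR (Value, ValueIfError) can be sketched in Python with a small helper. The helper name iferror is hypothetical, and Python exceptions only stand in for DAX errors here; the two languages do not raise errors in the same situations.

```python
def iferror(compute, value_if_error):
    """Return compute() unless it raises, in which case return value_if_error."""
    try:
        return compute()
    except Exception:
        return value_if_error

# No error: the computed value passes through unchanged.
print(iferror(lambda: 10 / 2, "Error"))
# Error: the fallback value is returned instead.
print(iferror(lambda: 10 / 0, "Error"))
```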

How to use the “COUNT” function in Power BI?

"COUNT" function in Power BI

COUNT() is an important function for writing DAX formulas in Power BI. It is one of the aggregation functions of DAX: it counts the values in a column so that the result can be used in analytics.

We apply DAX to slice and dice the data to extract valuable information. To import data from different data sources and perform the required transformations, we need to know how to use Power Query. If you are curious to know the difference between Power Query and DAX, here is an article you may be interested in.

Use of COUNT() in Power BI

The syntax of the COUNT function is very simple: we only have to pass the column name as an argument, as below.

Measure = COUNT (Table_name [Column_name])

When applied to a column, the COUNT function returns the number of cells that contain a non-blank value such as a number, date or string. So it always returns a whole number and skips the blank cells. If the column has no rows to count, the function returns blank.

Here is an example of the application of COUNT() on the data set I have on the rainfall of different Indian states. The dataset has three columns “SUBDIVISION” containing different ecological zones of the country, “YEAR” from 1901 to 2019 and “ANNUAL” containing rainfall in mm of the corresponding year.

Application of COUNT() in Power BI

I have collected the data from the web using the data scraping feature of Power BI Desktop. Here is a glimpse of the dataset.

Glimpse of the rainfall data

First, I have created a new measure using DAX (see here how can you create a new measure in Power BI). A measure has a default name “Measure” which I have changed to “Measure_count“.

Using COUNT() in a measure
Using COUNT() function in Power BI

Here you can see COUNT() is used to get the count of ANNUAL column cells having numbers. To see the result of COUNT() I have used a “Card”. The number “4090” in the card shows the cell count of the ANNUAL column having a number.

If we change the column and replace ANNUAL with SUBDIVISION, then the count function returns “4116”. This is because the ANNUAL column does not have a rainfall value for every subdivision and year. The difference, 4116 - 4090 = 26, tells us how many subdivision and year combinations do not have rainfall data.

The COUNTA() function

If a column consists of logical values like True and False, COUNT() fails to count them. To count such values there is another version of COUNT(), which is COUNTA(). COUNTA() counts every non-blank cell of the column, whether it holds a logical value, text, a number or a date; blank cells are still skipped.

In this data set we don’t have any logical values. If the COUNTA() function is applied to the same columns, i.e. ANNUAL and SUBDIVISION, the results are the same as COUNT() gave.
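
The difference between the two functions can be sketched in Python. This is a rough analogue, not DAX: None stands for a blank cell, the function names are hypothetical, and COUNT is simplified here to skipping logical values.

```python
def count_like(column):
    """Rough analogue of COUNT: non-blank cells, logical values not counted."""
    return sum(1 for v in column if v is not None and not isinstance(v, bool))

def counta_like(column):
    """Rough analogue of COUNTA: every non-blank cell, whatever its type."""
    return sum(1 for v in column if v is not None)

# Hypothetical columns; None marks a blank cell.
annual = [3373.2, None, 2957.4, 1805.0]
flags = [True, False, None, True]

print(count_like(annual), counta_like(annual))
print(count_like(flags), counta_like(flags))
```

On a column without logical values, such as the hypothetical annual list, both helpers agree, which matches the observation that COUNT() and COUNTA() give the same results on this data set.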

The COUNTAX() function

For cases where the value to count is not stored directly in a column but is the result of an expression, there is another useful variation of COUNT(), which is COUNTAX(). It evaluates an expression for each row of a table and returns the count of rows where the result is non-blank.

The DAX formula for COUNTAX() is:

COUNTAX ( <table>, <expression>)

It also returns a whole number. Unlike the COUNTA() function, it iterates through the rows of the table, evaluates the expression for each row and returns the count of non-blank results.

Here is an example of the application of COUNTAX() on the same table. I have used this function to count the rows of the ANNUAL column for a particular YEAR in the rainfall table. I have used the FILTER() function nested under COUNTAX() to filter the rows corresponding to YEAR = 1910 and YEAR = 2010.

Application of COUNTAX()

From the above figure we can see that the COUNTAX() function has returned two different whole numbers for the two years 1910 and 2010. This is because not every SUBDIVISION has a record of annual rainfall for the year 1910.
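
The combination of FILTER() and COUNTAX() can be mirrored in Python as a sketch. It is not DAX; the helper names and the four sample rows are hypothetical, and None stands for a missing rainfall value.

```python
def filter_rows(table, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]

def countax_like(table, expression):
    """Evaluate expression row by row and count the non-blank results."""
    return sum(1 for row in table if expression(row) is not None)

# Hypothetical rows; None marks a year with no recorded rainfall.
rainfall = [
    {"SUBDIVISION": "Kerala", "YEAR": 1910, "ANNUAL": 2999.1},
    {"SUBDIVISION": "Assam", "YEAR": 1910, "ANNUAL": None},
    {"SUBDIVISION": "Kerala", "YEAR": 2010, "ANNUAL": 3100.4},
    {"SUBDIVISION": "Assam", "YEAR": 2010, "ANNUAL": 2450.0},
]

count_1910 = countax_like(filter_rows(rainfall, lambda r: r["YEAR"] == 1910),
                          lambda r: r["ANNUAL"])
count_2010 = countax_like(filter_rows(rainfall, lambda r: r["YEAR"] == 2010),
                          lambda r: r["ANNUAL"])
print(count_1910, count_2010)
```

In this toy table the 1910 count is lower because one subdivision has a blank value for that year, which is the same effect seen in the real data.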

An overview of DAX in Power BI


As the name suggests, Data Analysis eXpressions or DAX in Power BI is a collection of operators, functions and constants which we use in writing formulas or expressions that return one or more values. It is the native formula language of Microsoft’s data analytics tools. DAX is also a highly versatile functional language with the capacity to work with relational data.

DAX helps us to dig into the data we already have in hand to explore new information. It helps us to perform dynamic aggregations and to slice and dice the data. It is different from Power Query, which has the M language at its core. Power Query performs the data extraction from different sources, whereas DAX is applied to the extracted data for analysis purposes.

It is very common to confuse DAX with Power Query. You can refer to this article for a detailed comparison between Power Query and DAX.

DAX formulas are similar to Excel formulas. Anyone with experience in writing Excel formulas finds it easy to write DAX formulas. However, DAX is far more advanced than Excel worksheet formulas.

DAX is mainly used to create “Measures” and “Calculated Columns”. Below is an example of creating a measure using DAX.

Example of DAX formula

Writing effective DAX formulas is the key. An effective DAX formula helps us to get the most out of the data. Writing DAX formulas in Power BI is easy. The Power BI DAX editor has a smart autocomplete feature, which automatically prompts us with probable options.

Now let’s try writing a DAX formula to perform a simple calculation. I already have a data set in the Power BI desktop on the rainfall of different Indian subdivisions. The data was scraped from the web using the data scraping tool of Power BI. You can get the details of how to do it in this article.

Below is an example of how a DAX measure has been created in Power BI Desktop. The screenshots from my Power BI Desktop show the steps of creating a measure. The purpose of the measure is to calculate the total annual rainfall.

First of all to create a new measure, right-click on the “Fields” pane of the Power BI desktop report/data window and then choose “New measure“.

Creating new measure

The default name of the measure is “Measure”. I have changed it to “Rainfall”. As you start writing the function name, Power BI starts suggesting relevant function names. Here I have selected “CALCULATE”. It is a very popular and frequently used function of DAX.

Steps for creating a measure using DAX

As we enter the “CALCULATE” function, it prompts us that it will accept an expression followed by filters. I have selected the “SUM” function with the “ANNUAL” column of the “rainfall_india” table inside it, as we want to calculate the total annual rainfall.

With this, the measure has been created. We can check the “Rainfall” measure in the “Fields” pane under the “rainfall_india” table.

Nested function in DAX

Inside the “CALCULATE” function again I have chosen the “SUM” function. This is an example of a nested function, which is a function within another function. Nested functions help us to narrow down the query to achieve the desired result.

DAX allows up to 64 levels of nested functions. However, using this many nested functions is very uncommon, as such complex formulas are very tough to debug and also take a long time to execute.

Using a measure in another measure

Another useful feature of DAX formulas is that a measure that has already been created can be used within another measure. For example, if we want to further narrow down the result to calculate the total annual rainfall of any particular subdivision, we can use the “Rainfall” measure we already created. Let’s see how to do it.

For example, we want to know the total annual rainfall of the state “Kerala“. The measure “Rainfall” calculates the total annual rainfall. So, we need to provide a filter within the calculate function along with the “Rainfall” measure.

Using a measure within a measure

See the above image where I have nested one measure within another. A table and a bar chart are also created to compare the total annual rainfall and Kerala_rainfall, just to show how the measures are performing.
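
The idea of reusing one measure inside another, with an extra filter, can be sketched in Python. This is an analogue of the logic, not DAX; the function names and the sample rows are hypothetical.

```python
def rainfall_measure(rows):
    """Analogue of the "Rainfall" measure: sum of ANNUAL over the rows in context."""
    return sum(r["ANNUAL"] for r in rows)

def kerala_rainfall(rows):
    """Reuse rainfall_measure after narrowing the rows to one subdivision."""
    kerala_rows = [r for r in rows if r["SUBDIVISION"] == "Kerala"]
    return rainfall_measure(kerala_rows)

# Hypothetical rows standing in for the rainfall_india table.
rainfall_india = [
    {"SUBDIVISION": "Kerala", "YEAR": 2009, "ANNUAL": 3000.0},
    {"SUBDIVISION": "Kerala", "YEAR": 2010, "ANNUAL": 3100.0},
    {"SUBDIVISION": "Assam", "YEAR": 2010, "ANNUAL": 2400.0},
]

print(rainfall_measure(rainfall_india), kerala_rainfall(rainfall_india))
```

The second function contains no summing logic of its own. It only changes which rows the first one sees, which is the role the filter plays inside the calculate function.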

Row context and filter context of DAX

These two concepts of context are very important for the effective use of DAX. Context determines which part of the data a formula is evaluated against, which is what makes DAX analysis dynamic.

Row context means that the formula is evaluated for one row at a time, with access to the values in that current row. It applies in calculated columns and in iterating functions. In most cases, we don’t even realize that we are applying the concept of row context.

Filter context is a more complex concept than row context. It is the set of filters that narrows down the data before a formula is evaluated. For example, here you can see how the column “SUBDIVISION” of “rainfall_india” has filtered the context and helped us to get the annual rainfall of a particular subdivision.

An overview of Power Query in Power BI


Power Query in Power BI plays the role of a data connection technology. It does the data mashup i.e. connect, combine and refine data from many sources to meet the need of our data analysis.

Power Query is built into Excel 2016 and later versions of Excel. It can also be added to Excel 2010 and Excel 2013 as an add-in. It is mainly used for data Extraction, Transformation and Load (ETL) in an Excel worksheet or a Power BI model.

ETL is something that takes up the major portion of a data analyst’s time. To ease this task, Power Query takes raw data from the source and converts it into a more workable form. This form of data is easy to analyze and to draw insights from.

Data sources for Power Query

Power Query in Power BI and Excel allows us to extract data from almost any external sources and Excel itself. Here are some examples of the external sources we can bring data from. And there are many more…

Some examples of external sources power query in Power BI can bring data from

After the data has been extracted from the desired source, Power Query helps us clean and prepare the data.

Using Power Query, we can easily append or stack different data tables. We can create relationships by merging different data tables, and group and summarize data using the Pivot feature provided by Power Query.

The beauty of Power Query in Power BI lies in the fact that all this data transformation does not affect the original data set. The data transformation happens in the Power BI memory and we can anytime get back our old data just by removing any particular data transformation step.

Applied Steps can be managed from Query Settings

Once we have summarized the data extracted from diverse sources, the report can be refreshed with one click. Every time new data is added to the source data folder, Power Query helps us to update the report accordingly with this refresh feature.

Flow of data processing by Power Query in Power BI

The M language and structure of Power Query

The M language is at the core of Power Query. It is a functional language that resembles F#, is case sensitive, and is built around code blocks made of a "let" section followed by an "in" section, as shown below.

let
     variable = expression [, ...]
in
     variable

These blocks consist of steps that declare and define variables. Power Query is very flexible about the physical position of these logical steps: a step can refer to a variable that is defined later in the block, because steps are evaluated according to their dependencies rather than their written order.

But such a type of coding with a different logical and physical structure is very tough to debug. So, unless absolutely necessary, we should maintain the same logical and physical structure of Power Query.

Editing the Power Query

Luckily we don’t need to write the Power Query in Power BI from scratch. It is already written in the background when we perform the data transformation steps. If it is needed we can tweak the Power Query to make desired changes.

First of all, we need to open the data transformation window by clicking the “Transform data” option in Power BI. Then the Power Query can be edited either in the “Advanced Editor” or by editing the code for each of the “Applied Steps” in “Query Settings”.

Editing the Power Query in Power BI

The image below consists of an example of Power Query where the data is stored in a variable called “source“. Some other variables are also declared here to store the data with different transformation steps.

The programming blocks of M language

The variables can be of any supported type and must have a unique name. If a variable name contains spaces, it must be written with a hash sign in front and enclosed in double quotes, for example #"Changed Type". This is the convention for naming Power Query variables.

How to do forecasting in Power BI Desktop?

Forecasting in Power BI desktop

Forecasting is predicting the future with the help of present and past data. The forecasting feature of Power BI Desktop is based on exponential smoothing. It is a very nifty feature, and this article will describe the process with practical data.
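
To give a feel for exponential smoothing, here is its simplest form in Python. This is only a sketch of the basic idea, not the algorithm Power BI actually runs; the function name, the smoothing factor alpha and the four rainfall values are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical annual rainfall values in mm.
rain = [2400.0, 2600.0, 2300.0, 2500.0]
levels = exponential_smoothing(rain, alpha=0.5)

# The last smoothed level serves as the forecast for the next time point.
print(levels, levels[-1])
```

A larger alpha makes the forecast follow recent values more closely, while a smaller alpha gives more weight to the older history.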

The data has been collected from Wikipedia using the data scraping feature provided in Power BI. I have described here how you can load the data from the web with this feature.

The data I have collected has several years of information on the monthly and annual rainfall of different regions of India. This data can be used to predict the rainfall of those particular regions for the coming years.

The data

I have selected only a single region, including West Bengal and Sikkim, with rainfall data from 1901 to 2016. The rainfall has been recorded in mm. On the basis of this many years of data, let’s try to predict the rainfall of the next 5 years.

Here is a glimpse of the data.

Creating the line chart

In order to apply the Forecasting feature in the Power BI desktop, we need to create a line chart first. The line chart option is available in the “Visualizations” pane of the application.

Select the “Line chart” option from visualizations and then select appropriate variables from “Fields“.

The line chart option and variable selection

The next step is selecting the “Year” as the Axis variable and “Annual” rainfall in the Values. Consequently, the line chart will be created.

Creating the Line chart

Forecasting in the Power BI desktop

Now that the line chart is ready, we need to create a forecast for future time points. In the “Visualizations” pane, under “Analytics”, you can get the option for “Forecast”. But unless your data has at least 60 time points, the option will not be available.

Forecast tool in Power BI

Go with the default values of Forecast and click Apply. Now a forecast for 10 future time points is produced. As in my case each year is an individual time point, a forecast for the next 10 years is produced.

The confidence interval is 95% by default. In layman’s terms, it is the band around the forecast values within which the actual value is expected to fall about 95 times out of 100.

Producing forecast with default options

But you can see that the forecast produced does not appear to be very realistic. It bears no similarity to the historical trend. So, something is wrong here. The seasonality was left to be selected automatically, which is not working in the present case.

Seasonality in Forecasting

We need to provide an appropriate value for “Seasonality”. This parameter is the most important one in forecasting. So let’s try to adjust this value to get the most accurate result.

Seasonality in time series forecasting refers to a time period during which the data shows some regular and predictable changes. This period may be weeks, months or years with a cyclic pattern.

Identifying the “Seasonality” in forecasting

We can identify this cycle from the line chart we created. If we closely analyze the line chart and zoom in a little, we can notice that the line repeats a pattern every 5-6 years.

So, I will try to create the forecast with seasonality values close to 5 time points.

Checking accuracy of the forecast

To check the performance of the forecast, the forecast tool of Power BI Desktop has a feature called “Ignore last”. It simply helps us to produce the forecast while leaving out the last few points, as many as specified in this field.

This means that for these time periods we have both the observed and the forecast values. Thus we can compare how precise the forecast is.
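
The hold-out idea behind “Ignore last” can be sketched in Python. Note the assumptions: a seasonal-naive forecast (repeat the last season) is used here as a simple stand-in and is not the method Power BI uses, and the function names and the repeating series are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast by repeating the last full season of the history."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]

def holdout_error(series, ignore_last, season_length):
    """Hide the last points, forecast them, and return the mean absolute error."""
    train, actual = series[:-ignore_last], series[-ignore_last:]
    forecast = seasonal_naive_forecast(train, season_length, ignore_last)
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / ignore_last

# Hypothetical series whose pattern repeats every 5 time points.
series = [10, 20, 30, 20, 10] * 4

error_5 = holdout_error(series, ignore_last=5, season_length=5)
error_4 = holdout_error(series, ignore_last=5, season_length=4)
print(error_5, error_4)
```

In this toy series the error is smallest when the assumed season length matches the true cycle, which is the same comparison made with the seasonality setting in the forecast tool.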

If we take a seasonality of 4 or 6 time points, the forecast differs a lot from the observed values. See the images below. For example, for the year 2011, the actual rainfall is 2418.70mm and the forecast is 2733.56mm.

Forecast with seasonality 4

If we set the seasonality to 6, the forecast is again very different from the original value.

Forecast with seasonality 6

But if we set the seasonality to 5, we achieve the best forecast, with values closest to the original rainfall. If we again take the example of the year 2011, with a seasonality of 5 time points the rain forecast is 2337.89mm.
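
The gap between the two 2011 forecasts quoted above and the actual rainfall can be worked out directly. The short Python calculation below uses only the three figures from the text; the dictionary labels are my own shorthand.

```python
actual_2011 = 2418.70  # observed rainfall in mm

# Forecasts quoted in the text for the year 2011.
forecasts = {
    "poorly chosen seasonality": 2733.56,
    "best seasonality": 2337.89,
}

# Absolute error of each forecast, rounded to two decimals.
errors = {name: round(abs(value - actual_2011), 2) for name, value in forecasts.items()}

for name, error in errors.items():
    share = 100 * error / actual_2011
    print(f"{name}: off by {error:.2f} mm ({share:.1f}% of the actual value)")
```

The first forecast misses by 314.86 mm, while the best one misses by 80.81 mm.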

The “Format” option allows us to change the style of the forecast report generated. We can change the confidence interval pattern, line pattern and colour etc.