
Machine Learning and Artificial Intelligence in Data Visualization (Part I)
Learn more about machine learning, models, supervised and unsupervised learning, and algorithms prior to developing an add-on for LightningChart JS
Written by a human | Updated on April 11th, 2025
Introduction
Artificial intelligence and machine learning have become increasingly important in various areas, including healthcare, public safety, banking, and agriculture. Machine learning has simplified and accelerated the analysis of large volumes of data, making it an essential method for working with data. Among the many beneficiaries of machine learning, data science is undoubtedly one of the main areas.
As a subset of data science, data visualization is another essential field that can translate complex data structures and patterns into a more understandable form for humans. Data visualization is used on a daily basis in a wide range of fields, including logistics, science, finance, healthcare, marketing, and education. However, not everyone is prepared to use machine learning methods in their work, mainly due to the complexity of implementation and the high cost of hiring specialists in this field.
The main goal of this article is to simplify the process of working with machine learning for users with no prior experience in the field. The author of the article will collaborate with LightningChart Oy to develop an add-on to the LightningChart JS library, allowing users to work with machine learning models directly in their browser without requiring in-depth knowledge of the field. The article will also demonstrate the library’s real-world use cases for machine learning.
Structure & Goals
The article is divided into 5 chapters, each describing different stages of the article project.
• Chapters 1 and 2 are the preparation for the implementation of the project. They describe the goals of the project, analysis of the relevance of the project, as well as the necessary background information.
• Chapter 3 demonstrates the classes developed as a result of the article project and explains their main methods.
• Chapters 4 and 5 conclude the article report by showing the use of implemented classes on real-world examples and evaluating the work.
The key goal of the article is to create an add-on for the LightningChart JS library that will allow users to use machine learning methods in the browser. To achieve this goal, the following side goals must be met:
• Collect and study necessary information.
• Create code with JavaScript and TensorFlow JS.
• Test this code on real-world examples with the LightningChart JS library.
Project Relevance
Machine learning technology is now more relevant than ever and has a huge number of applications. One of the main goals of this project is to make it applicable in as many areas of human life as possible. To achieve this, the author uses some of the most popular machine learning models. The following are some of the possible applications of this project:
Finance
Forecasting sales: Linear regression can be used to model the relationship between sales and various predictors such as advertising spending, seasonality, and pricing. KNN can be used to predict sales based on the past sales patterns of similar products or customers.
Stock prediction: Linear regression can be used to model the relationship between stock prices and various economic indicators such as interest rates, GDP, and inflation. Logistic regression can be used to predict whether a stock will go up or down based on historical price movements. KNN can be used to predict stock prices based on the past price patterns of similar stocks.
Marketing
Consumer behavior prediction: Logistic regression can be used to predict whether a customer will buy a product based on various demographic, behavioral, and psychographic factors. KNN can be used to predict customer behavior based on the past behavior patterns of similar customers.
Security
Fraud identification: Logistic regression can be used to predict the likelihood of fraud based on various factors such as transaction amount, location, and time. KNN can be used to identify fraudulent transactions based on the past transaction patterns of similar customers.
Other
Spam detection: Logistic regression can be used to classify emails as spam or not spam based on various features such as sender, subject, and content. KNN can be used to classify emails based on the past classification patterns of similar emails.
Weather forecasting: Linear regression can be used to model the relationship between various meteorological factors such as temperature, humidity, and pressure. KNN can be used to predict weather conditions based on the past weather patterns of similar regions.
Machine Learning
Machine learning is a field of Artificial Intelligence that uses algorithms to learn from historical data. While there are many different definitions of machine learning, this is a general description of its main purpose.
With the help of machine learning, AI systems can analyze data, memorize information, make predictions, reproduce pre-made models, and choose the most suitable option from the proposed choices.
Machine learning is particularly useful in situations that require a large amount of computation, such as bank scoring, analytics in the field of marketing and statistical research, business planning, demographic research, investments, and the identification of fake news and fraudulent sites.
In these and other areas, machine learning algorithms can significantly simplify data analysis, speed up decision-making processes, and reduce the potential for human error. Machine learning has become an increasingly important tool for many businesses and organizations, as it can help them to uncover valuable insights, gain a competitive advantage, and make more informed decisions.
Types of Machine Learning
There are many subsets of machine learning, each with its own optimal use case, data shape, and training methods. The four main types of machine learning are described below.
Supervised Learning
Supervised machine learning requires a dataset with labeled inputs and desired outputs. Once trained, predictions about new observations can be made. Supervised learning can be divided into two subtypes: regression and classification. In the first case, the output is a continuous value, while in the second case, the output is a discrete value or a probability.
Unsupervised Learning
Unsupervised machine learning algorithms use unlabeled data for training. They do not predict the output like supervised learning, but instead explore the dataset, searching for meaningful connections and describing hidden structures in unlabeled data. Depending on the model’s objectives, unsupervised learning can be used to group data in different ways.
The most common application of unsupervised learning is called clustering. Clustering models look for similar data and group it together. Another type of unsupervised learning model is called anomaly detection. These models look for unusual patterns in a dataset.
The last example of unsupervised learning is association models. They use several key data point features to predict other features they are associated with.
Semi-supervised Learning
Semi-supervised machine learning uses both labeled and unlabeled data for training. The training dataset consists of a small amount of labeled data and a huge amount of unlabeled data. The machine learning algorithm groups similar data together, thus helping to label unlabeled data. This method is especially effective when labeling the data and extracting the desired features from the dataset is costly and time-consuming.
Reinforcement Learning
Reinforcement machine learning works by setting an algorithm with a distinct goal. This algorithm attempts to find the optimal path to a given goal. If it takes actions that are beneficial to the goal, it receives a reward; if it takes action that moves it away from the goal, it receives a punishment. Thus, the algorithm tries to solve the problem by getting as many rewards as possible, while avoiding punishments.
Choosing Machine Learning Models
Not all types and models of machine learning are suitable for this project. Therefore, when choosing machine learning models, five factors were considered:
Model popularity
According to Matthias Döring (2018), supervised learning models are used in more than 50% of cases, with linear regression and logistic regression ranked first and second respectively. Neural networks take third place, followed by decision trees and SVMs.
Suitability of the model for visualization
Suitability for visualization means the ability to visualize as many steps of the model as possible. Most important is the ability to visualize the model's input and output; the ability to visualize its parameters and metrics is secondary.
Hardware requirements
According to Himanshu Singh (2019), a lot of computing power can be required to train a machine-learning model. High-performance GPUs and CPUs may be required to complete some tasks in a reasonable amount of time. Since the LightningChart JS library can run on a wide range of devices from low-end to high-end, machine learning models with high hardware load will be excluded from selection.
Amount of input
Since the goal of this project is to make machine learning models easier to use, the user should not be overwhelmed by the many functions and parameters.
Therefore, it is desirable to minimize the amount of input required from the user.
The complexity of the model
What stops many people from using machine learning models is the difficulty of understanding their algorithms. Being able to understand the learning process and how the model generates its output plays an important role in building trust in machine learning. That is why an important factor in choosing a model for this project is how easy it is to understand the model and to explain the principles of its algorithm to others.
FIGURE 1: The popularity of supervised machine learning models across different fields (Matthias Döring, 2018).
Final Model Selection
Considering all the above factors, the two most popular supervised learning models were chosen for the project: linear regression and logistic regression. These models meet all the above criteria. Choosing these particular models also makes it possible to demonstrate both types of supervised learning: regression and classification.
It was also decided to include the K-Nearest Neighbor algorithm in the project. This simple and versatile supervised learning algorithm can solve both regression and classification problems. It is good for working with a small amount of data. It was chosen to introduce the user to machine learning.
Machine-Learning Algorithms
This section explains the machine learning algorithms used in this project.
K-Nearest Neighbor Algorithm
The basic principle on which the K-Nearest Neighbor algorithm works is the assumption that similar things exist next to each other. This allows the KNN algorithm to find patterns between features and labels (independent and dependent data).
KNN finds distances between features of query data and features of data points from the prepared dataset. Then it sorts calculated distances from smallest to largest and selects the top K data points closest to the query point. Further action of the algorithm will differ depending on the type of problem.
In the case of regression, the algorithm calculates the mean of the labels of the nearest K data points. And in the case of classification, the algorithm returns the most frequent label of the nearest K data points.
FIGURE 2: K-Nearest Neighbor algorithm.
K-Nearest Neighbor is a simple and versatile algorithm that does not require building a model and tuning a large number of parameters. But the main disadvantage of KNN is its low performance with a large amount of data.
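The steps described above (compute distances, sort them, take the K nearest points, then average or vote) can be sketched in plain JavaScript. This is a minimal illustration, not the add-on's actual code: the function names, the `{ features, label }` data shape, the use of Euclidean distance, and the toy data are all assumptions made for the example.

```javascript
// Hypothetical helpers for illustration; not part of the LightningChart JS API.

// Euclidean distance between two feature vectors of equal length
// (an assumed distance measure; the article does not name one).
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Return the k dataset points closest to the query, nearest first.
function nearestNeighbors(dataset, query, k) {
  return dataset
    .map((point) => ({
      label: point.label,
      distance: euclideanDistance(point.features, query),
    }))
    .sort((p, q) => p.distance - q.distance)
    .slice(0, k);
}

// Regression: mean of the labels of the k nearest points.
function knnRegress(dataset, query, k) {
  const neighbors = nearestNeighbors(dataset, query, k);
  return neighbors.reduce((sum, n) => sum + n.label, 0) / neighbors.length;
}

// Classification: most frequent label among the k nearest points.
function knnClassify(dataset, query, k) {
  const counts = new Map();
  for (const n of nearestNeighbors(dataset, query, k)) {
    counts.set(n.label, (counts.get(n.label) || 0) + 1);
  }
  let best = null;
  let bestCount = -1;
  for (const [label, count] of counts) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
    }
  }
  return best;
}

// Toy data, invented for this example only.
const labeledPoints = [
  { features: [1, 1], label: "A" },
  { features: [1, 2], label: "A" },
  { features: [2, 1], label: "A" },
  { features: [8, 8], label: "B" },
  { features: [9, 8], label: "B" },
];
const numericPoints = [
  { features: [1], label: 10 },
  { features: [2], label: 20 },
  { features: [3], label: 30 },
  { features: [10], label: 100 },
];

console.log(knnClassify(labeledPoints, [1.5, 1.5], 3)); // "A"
console.log(knnRegress(numericPoints, [2], 3)); // 20
```

Because every query scans and sorts the whole dataset, the cost grows with the number of stored points, which is the performance drawback mentioned above.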
Solving Linear Regression with Gradient Descent Algorithm
To implement Linear Regression and Logistic Regression, the Gradient Descent algorithm will be used. This is a popular algorithm that is used in many Machine Learning models.
Linear Regression is a supervised machine learning model that attempts to find a relationship between a pair of variables by fitting a linear equation to historical data. For example, Figure 3 shows a linear function that fits perfectly into the observed dataset. This is what the desired result of the Gradient Descent algorithm looks like.
FIGURE 3: A linear function that fits historical data (Dhanoop Karunakaran, 2020).
The gradient descent algorithm starts with a random linear function and improves it until the best-fitting function is found.
The Mean Squared Error (MSE) is used to evaluate and adjust the function. It is the average squared difference between the predicted value and the actual value:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$$
Here n is the number of data points, y_i is the actual value, and f(x_i) is the value predicted by the current function.
The MSE is needed to determine how well the current linear function fits the data. The lower the MSE, the closer the function is to the ideal result.
Assume the linear function has the form f(x) = mx + b. The values of m and b should then be chosen so that the MSE of the resulting function is as close as possible to its minimum.
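As a rough sketch of how the MSE of a candidate line f(x) = mx + b could be computed, the following plain JavaScript function averages the squared differences between predicted and actual values. The function name and the sample data are assumptions made for illustration.

```javascript
// Mean squared error of the line f(x) = m * x + b over paired samples.
// Hypothetical helper; the name is not taken from the article.
function meanSquaredError(xs, ys, m, b) {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = ys[i] - (m * xs[i] + b); // actual minus predicted
    total += error * error;
  }
  return total / xs.length;
}

// Invented sample data lying exactly on y = 2x + 1.
const sampleX = [1, 2, 3, 4];
const sampleY = [3, 5, 7, 9];

console.log(meanSquaredError(sampleX, sampleY, 2, 1)); // 0 (perfect fit)
console.log(meanSquaredError(sampleX, sampleY, 1, 0)); // 13.5 (poor guess)
```

The second call shows how a poor guess for m and b produces a larger MSE than the exact fit.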
So, to understand how far the current function is from the ideal one, it is necessary to find the minimum value of MSE. This can be achieved by differentiating the MSE and setting the derivative equal to 0. Since in the case of linear regression the MSE depends on two variables, two partial derivatives of the MSE function should be computed (Polyanin 2008, 2). For f(x) = mx + b, the first partial derivative is taken with respect to m:
$$\frac{\partial\,\mathrm{MSE}}{\partial m} = -\frac{2}{n} \sum_{i=1}^{n} x_i \left( y_i - (m x_i + b) \right)$$
and the second partial derivative is taken with respect to b:
$$\frac{\partial\,\mathrm{MSE}}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - (m x_i + b) \right)$$
Computing the partial derivatives reveals two important facts about the current guess. The first is the magnitude of each partial derivative, that is, its distance from 0. The smaller this magnitude, the closer the current guess is to the optimal values of m and b.
The second is the sign of each partial derivative. The sign shows in which direction the values of m and b should be changed: a negative derivative means the parameter should be increased, and a positive derivative means it should be decreased.
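To make the role of the partial derivatives concrete, here is a minimal JavaScript sketch that evaluates both of them for a given guess of m and b. It uses the standard derivatives of the MSE for f(x) = mx + b. The function name and the sample data are assumptions for illustration.

```javascript
// Partial derivatives of the MSE with respect to m and b
// for the line f(x) = m * x + b. Hypothetical helper name.
function mseGradients(xs, ys, m, b) {
  const n = xs.length;
  let gradM = 0;
  let gradB = 0;
  for (let i = 0; i < n; i++) {
    const error = ys[i] - (m * xs[i] + b); // actual minus predicted
    gradM += (-2 / n) * xs[i] * error;
    gradB += (-2 / n) * error;
  }
  return { gradM, gradB };
}

// Invented sample data lying exactly on y = 2x + 1.
const xValues = [1, 2, 3, 4];
const yValues = [3, 5, 7, 9];

// Far from the optimum: large magnitudes, and negative signs
// indicate that both m and b should be increased.
const farGuess = mseGradients(xValues, yValues, 0, 0);
console.log(farGuess.gradM, farGuess.gradB); // -35 -12

// At the exact fit (m = 2, b = 1) both derivatives vanish.
const exactGuess = mseGradients(xValues, yValues, 2, 1);
```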
Now that the current guess has been analyzed using partial derivatives, the m and b values can be adjusted. To do this, the Gradient Descent algorithm uses the so-called learning rate. The learning rate, denoted here by α, is a parameter that determines the step size at which the model parameters are updated during training. With it, the values of m and b are adjusted using the following formulas:
$$m_{\text{new}} = m - \alpha \, \frac{\partial\,\mathrm{MSE}}{\partial m}, \qquad b_{\text{new}} = b - \alpha \, \frac{\partial\,\mathrm{MSE}}{\partial b}$$
The algorithm then repeats the steps above, lowering the MSE with each iteration. Note that the exact minimum of the MSE is rarely reached. Therefore, the gradient descent algorithm should be stopped when an iteration no longer changes the MSE meaningfully.
FIGURE 4: Gradient Descent Algorithm.
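Putting the pieces together, the whole procedure could be sketched in plain JavaScript as follows. This is only an illustrative sketch under stated assumptions: the function name `fitLine`, the option names, and their default values are hypothetical, and the starting point is fixed at m = 0, b = 0 instead of a random guess so that the run is reproducible.

```javascript
// Fit f(x) = m * x + b by gradient descent on the MSE.
// Hypothetical helper; names and defaults are assumptions for this sketch.
function fitLine(xs, ys, options = {}) {
  const { learningRate = 0.01, maxIterations = 10000, tolerance = 1e-12 } = options;
  const n = xs.length;
  let m = 0; // fixed start instead of a random one, for reproducibility
  let b = 0;
  let previousMse = Infinity;
  let iterations = 0;

  while (iterations < maxIterations) {
    let mse = 0;
    let gradM = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = ys[i] - (m * xs[i] + b);
      mse += (error * error) / n;
      gradM += (-2 / n) * xs[i] * error;
      gradB += (-2 / n) * error;
    }
    // Stop when the MSE no longer changes meaningfully.
    if (Math.abs(previousMse - mse) < tolerance) break;
    previousMse = mse;

    // Step against the gradient, scaled by the learning rate.
    m -= learningRate * gradM;
    b -= learningRate * gradB;
    iterations++;
  }
  return { m, b, iterations };
}

// Invented sample data lying exactly on y = 2x + 1.
const trainX = [1, 2, 3, 4];
const trainY = [3, 5, 7, 9];
const fitted = fitLine(trainX, trainY, { learningRate: 0.05 });
console.log(fitted); // m close to 2, b close to 1
```

The learning rate matters: a value that is too large makes the updates overshoot and diverge, while a value that is too small makes convergence slow.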
Solving Logistic Regression with Gradient Descent Algorithm
Logistic regression is a supervised classification model that estimates the probability of an event occurring by finding the relationship between independent variables and a discrete dependent variable. Although a gradient descent algorithm will also be used to implement this model, there are several differences in its implementation compared to linear regression.
The first difference is that events whose probability is calculated using logistic regression must be represented in discrete form. For example, when calculating the probability of rain, rainy weather can be represented as 1, and non-rainy weather as 0.
The second difference is that in the case of logistic regression, a sigmoid function is used instead of a linear function. This is because a linear function does not fit well on discrete data for two reasons. The first reason is that since the linear function goes beyond 0 and 1 on the y-axis, it is impossible to get the probability as an output. The second reason is that leverage points (data points with an unusual independent value) can affect the linear function too much.
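The sigmoid function can be sketched in a few lines of JavaScript. It maps any real number into the range between 0 and 1, which is what allows the output to be read as a probability. The function names are assumptions for this example.

```javascript
// Sigmoid (logistic) function: squashes any real number into the range 0..1.
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

// Probability predicted for input x: the linear function m * x + b
// passed through the sigmoid. Hypothetical helper name.
function predictProbability(x, m, b) {
  return sigmoid(m * x + b);
}

console.log(sigmoid(0)); // 0.5
console.log(sigmoid(4) > 0.9, sigmoid(-4) < 0.1); // true true
```

Unlike the linear function, the output stays within 0 and 1 even for extreme inputs, so unusual data points cannot push the prediction outside the valid probability range.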
It is also necessary to consider that in the case of logistic regression, the relationship between the MSE and the values of m and b is not a parabola, so the function is not convex. Therefore, updating m and b by following the MSE gradient may lead the algorithm to a local minimum instead of the global one, which means it will not always find the optimal values of m and b. To avoid this problem, cross-entropy is used as the cost function instead of MSE.
FIGURE 5: Linear Regression and Logistic Regression examples (Ashish Mehta, 2020).
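As a rough illustration of logistic regression trained with gradient descent on the cross-entropy cost, consider the plain JavaScript sketch below. It assumes a single input feature and the binary cross-entropy loss, and it uses the standard gradient of that loss for a sigmoid output (the mean of (p - y) * x for m and the mean of (p - y) for b). The function names, option defaults, and the toy data are assumptions, not the add-on's implementation.

```javascript
// Probability that the event occurs for input x (sigmoid of m * x + b).
function sigmoidProbability(x, m, b) {
  return 1 / (1 + Math.exp(-(m * x + b)));
}

// Binary cross-entropy cost averaged over all samples.
function crossEntropyLoss(xs, ys, m, b) {
  const eps = 1e-12; // keeps Math.log away from 0
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    const p = Math.min(Math.max(sigmoidProbability(xs[i], m, b), eps), 1 - eps);
    total += -(ys[i] * Math.log(p) + (1 - ys[i]) * Math.log(1 - p));
  }
  return total / xs.length;
}

// Gradient descent on the cross-entropy cost. Hypothetical helper.
function fitLogistic(xs, ys, options = {}) {
  const { learningRate = 0.1, iterations = 10000 } = options;
  const n = xs.length;
  let m = 0;
  let b = 0;
  for (let step = 0; step < iterations; step++) {
    let gradM = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const diff = sigmoidProbability(xs[i], m, b) - ys[i]; // p - y
      gradM += (diff * xs[i]) / n;
      gradB += diff / n;
    }
    m -= learningRate * gradM;
    b -= learningRate * gradB;
  }
  return { m, b };
}

// Invented data: hours studied and whether the exam was passed (1) or not (0).
const hoursStudied = [1, 2, 3, 4, 5, 6, 7, 8];
const examPassed = [0, 0, 0, 1, 0, 1, 1, 1];

const logisticModel = fitLogistic(hoursStudied, examPassed);
const startLoss = crossEntropyLoss(hoursStudied, examPassed, 0, 0);
const finalLoss = crossEntropyLoss(hoursStudied, examPassed, logisticModel.m, logisticModel.b);
console.log(finalLoss < startLoss); // true
console.log(sigmoidProbability(1, logisticModel.m, logisticModel.b) < 0.5); // true
console.log(sigmoidProbability(8, logisticModel.m, logisticModel.b) > 0.5); // true
```

The discrete labels 0 and 1 follow the encoding described above, and the model outputs a probability that can be thresholded at 0.5 to obtain a class.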
Conclusion
In conclusion, this first part laid the groundwork for an add-on to the LightningChart JS library that lets users work with machine learning models directly in the browser. Based on model popularity, suitability for visualization, hardware requirements, the amount of input required, and complexity, three supervised learning methods were selected: linear regression, logistic regression, and the K-Nearest Neighbor algorithm.
The article also described how KNN makes predictions and how the Gradient Descent algorithm is used to fit linear and logistic regression models.
