10 Powerful Applications of Linear Algebra in Data Science (with Multiple Resources)

Source: 分析大師 | 2019-07-23 | Published by 經管之家

If Data Science were Batman, Linear Algebra would be Robin. This faithful sidekick is often ignored, but in reality it powers major areas of Data Science, including the hot fields of Natural Language Processing and Computer Vision.

I have personally seen a LOT of data science enthusiasts skip this subject because they find the math too difficult to understand. Since the programming languages used for data science offer a plethora of packages for working with data, people don't bother much with linear algebra.

That's a mistake. Linear algebra is behind all the powerful machine learning algorithms we are so familiar with. It is a vital cog in a data scientist's skillset. As we will soon see, you should consider linear algebra a must-know subject in data science.

And trust me, linear algebra really is all-pervasive! It will open up possibilities for working with and manipulating data that you would not have imagined before.

In this article, I explain in detail ten awesome applications of linear algebra in Data Science, broadly categorized into four fields for your reference. I have also provided resources for each application so you can dive deeper into the one(s) that grab your attention.

Note: Before you read on, I recommend going through this superb article – Linear Algebra for Data Science. It's not mandatory for understanding what we will cover here, but it's a valuable addition to your budding skillset.

Why should you spend time learning linear algebra when you can simply import a package in Python and build your model? I have come across this question way too many times, and it's a fair one. So let me present my point of view.

I consider linear algebra one of the foundational blocks of Data Science. You cannot build a skyscraper without a strong foundation, can you? Think of this scenario: you want to reduce the dimensions of your data using Principal Component Analysis (PCA).
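As a minimal sketch of that decision, assuming scikit-learn and NumPy are available (neither is prescribed by the article), the cumulative explained variance ratio from PCA can tell you how many components to keep. The toy data and the 95% threshold here are illustrative assumptions, not part of the original text.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Toy data: 200 samples in 5 dimensions, with most variance in the first two axes.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 0.5, 0.2, 0.1])

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Keep the smallest number of components that explains at least 95% of the variance.
n_components = int(np.searchsorted(cumulative, 0.95) + 1)
print(n_components)
```

Knowing how `explained_variance_ratio_` is derived from the eigenvalues of the covariance matrix is exactly the kind of mechanics the next paragraph refers to.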
How would you decide how many principal components to preserve if you did not know how that choice would affect your data? Clearly, you need to know the mechanics of the algorithm to make this decision.

With an understanding of linear algebra, you will be able to develop a better intuition for machine learning and deep learning algorithms instead of treating them as black boxes. This will allow you to choose proper hyperparameters and develop a better model. You will also be able to code algorithms from scratch and make your own variations to them. Isn't this why we love data science in the first place – the ability to experiment and play around with our models? Consider linear algebra the key to unlocking a whole new world.

The big question – where does linear algebra fit in machine learning? Let's look at four applications you will all be quite familiar with.

You must be quite familiar with how a model, say a Linear Regression model, fits a given dataset. But wait – how can you calculate how different your prediction is from the expected output? Loss functions, of course.

A loss function is an application of the vector norm in linear algebra. The norm of a vector is simply its magnitude. There are many types of vector norms; I will quickly explain two of them:

- The L1 norm (Manhattan distance): in 2D space, you could reach the vector (3, 4) by traveling 3 units along the x-axis and then 4 units parallel to the y-axis (as shown). Or you could travel 4 units along the y-axis first and then 3 units parallel to the x-axis. In either case, you will travel a total of 7 units – that total is the L1 norm.
- The L2 norm (Euclidean distance): the straight-line distance from the origin to (3, 4), calculated using the Pythagorean theorem (I can see the old math concepts flickering on in your mind!). It is the square root of (3^2 + 4^2), which is equal to 5.

But how is the norm used to find the difference between the predicted values and the expected values? Say the predicted values are stored in a vector P and the expected values in a vector E. Then P - E is the difference vector.
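The two norms, and a loss computed as the norm of the difference vector, can be checked in a few lines. This is an illustrative sketch using NumPy; the vectors P and E below are hypothetical values, not from the article.

```python
import numpy as np

v = np.array([3.0, 4.0])
l1 = np.linalg.norm(v, ord=1)  # |3| + |4| = 7 (Manhattan distance)
l2 = np.linalg.norm(v)         # sqrt(3^2 + 4^2) = 5 (Euclidean distance)

P = np.array([2.5, 4.0, 1.0])  # predicted values (hypothetical)
E = np.array([3.0, 3.5, 1.0])  # expected values (hypothetical)
loss = np.linalg.norm(P - E)   # L2 norm of the difference vector
print(l1, l2, loss)
```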
And the norm of P - E is the total loss for the prediction.

Regularization is a very important concept in data science. It's a technique we use to prevent models from overfitting, and it is actually another application of the norm.

A model is said to overfit when it fits the training data too well. Such a model does not perform well on new data because it has learned even the noise in the training data; it cannot generalize to data it has not seen before. The below illustration sums up this idea really well:

Regularization penalizes overly complex models by adding the norm of the weight vector to the cost function. Since we want to minimize the cost function, we need to minimize this norm as well. This drives unneeded components of the weight vector toward zero and prevents overfitting.
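A minimal sketch of this idea, assuming NumPy: L2 (ridge) regularization adds the squared norm of the weight vector, scaled by a penalty strength `lam`, to the mean-squared-error cost. The data, weights, and `lam` value below are illustrative assumptions.

```python
import numpy as np

def ridge_cost(w, X, y, lam):
    """Mean squared error plus lam times the squared L2 norm of the weights."""
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.dot(w, w)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])

# These weights fit the toy data exactly, so the unregularized cost is 0;
# turning on the penalty makes the same weights look "more expensive".
cost_plain = ridge_cost(w, X, y, 0.0)
cost_ridge = ridge_cost(w, X, y, 0.1)  # adds 0.1 * (1^2 + 2^2)
print(cost_plain, cost_ridge)
```

Because the penalty grows with the magnitude of the weights, minimizing the total cost pulls the weight vector toward zero, which is exactly the shrinkage described above.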



