Question |
Answer |
what is supervised learning
|
|
machine learning task of inferring a function from labeled training data
|
|
|
what is machine learning
|
|
machine learning explores the study and construction of algorithms that can learn from and make predictions on data
|
|
|
give examples of supervised learning algorithms
|
|
support vector machines, regression, naive bayes, decision trees
|
|
|
what is unsupervised learning
|
|
type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses
|
|
|
give examples of unsupervised learning algorithms
|
|
clustering (for example k-means), anomaly detection
|
|
|
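As a rough illustration of k-means clustering, here is a minimal one-dimensional sketch. The function name `kmeans_1d`, the sample points, and the starting centroids are all made up for this example; real implementations handle multiple dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Tiny 1-D k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point joins the cluster of its nearest centroid.
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 10.0])
print(centroids, clusters)
```

No labels are supplied anywhere: the grouping comes only from the distances between the inputs.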
what are the various classification algorithms
|
|
decision trees, svm, logistic regression, naive bayes
|
|
|
what is logistic regression?
|
|
a technique to predict a binary outcome from a linear combination of predictor variables
|
|
|
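A minimal sketch of the prediction step, assuming the weights have already been learned. The weights, bias, and feature values below are invented for illustration; fitting them from data is not shown.

```python
import math


def predict_proba(features, weights, bias):
    """Probability of the positive class: sigmoid of a linear combination of predictors."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias, threshold=0.5):
    """Binary outcome: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical, already-fitted parameters.
weights = [0.8, -0.4]
bias = -1.0

print(predict_proba([3.0, 1.0], weights, bias))
print(predict([3.0, 1.0], weights, bias), predict([0.0, 0.0], weights, bias))
```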
what is linear regression?
|
|
statistical technique where the score of a variable Y is predicted from the score of a second variable X. X is referred to as the predictor variable and Y as the criterion variable
|
|
|
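One common way to predict Y from X is an ordinary least squares line. The sketch below assumes that method and uses made-up data; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of X and Y divided by the variance of X gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]  # predictor variable X
ys = [3.0, 5.0, 7.0, 9.0]  # criterion variable Y
slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # predict Y for a new X
print(slope, intercept, predicted)
```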
what is a support vector machine (svm)
|
|
a supervised ml algorithm used for both regression and classification. svm uses hyperplanes to separate the different classes, based on the provided kernel function
|
|
|
what are the different kernel functions in svm
|
|
linear, polynomial, radial basis function (rbf), sigmoid
|
|
|
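The four kernels can be written down in their usual textbook forms. This is only a sketch: the parameter names and default values (`degree`, `gamma`, `coef0`) are assumptions chosen for the example and do not refer to any particular library.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    # Depends only on the squared distance between the two points.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(u, v))
```

Each kernel is a similarity score between two points; the choice of kernel determines the shape of the boundary the svm can draw.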
what is a random forest
|
|
each tree gives a classification. the forest chooses the classification having the most votes. in regression it takes the average of the trees' outputs
|
|
|
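The aggregation step alone can be sketched as follows. The per-tree outputs are hypothetical; training the individual trees is not shown.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_predictions):
    """Majority vote over the class predicted by each tree (ties go to the label seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Average of the numeric predictions made by each tree."""
    return mean(tree_predictions)


# Hypothetical outputs from three already-trained trees.
label = forest_classify(["spam", "ham", "spam"])
value = forest_regress([2.0, 3.0, 4.0])
print(label, value)
```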
what is a decision tree
|
|
used for regression and classification. it breaks a data set down into smaller and smaller subsets while at the same time an associated decision tree is incrementally developed. the final result is a tree with decision nodes and leaf nodes
|
|
|
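A single splitting step might look like the sketch below. It assumes Gini impurity as the split criterion, which is one common choice and not something the card specifies; the data and function names are illustrative.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the subset is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Try each threshold on one feature and keep the one with the lowest weighted impurity."""
    best = None
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must produce two non-empty subsets
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical one-feature data set with two classes.
values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
split = best_split(values, labels)
print(split)
```

Building a full tree would repeat this step on each resulting subset until the subsets are pure or a stopping rule is reached.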
what is boosting
|
|
an iterative technique which adjusts the weight of an observation based on the last classification. if an observation was classified incorrectly, it tries to increase the weight of this observation, and vice versa
|
|
|
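One round of reweighting can be sketched with an AdaBoost-style update. Using AdaBoost here is an assumption, since the card does not name a specific algorithm; the weights and outcomes are made up.

```python
import math


def reweight(weights, correct):
    """One AdaBoost-style round: raise weights of misclassified observations, lower the rest."""
    # Weighted error of the current weak learner.
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [
        w * math.exp(-alpha if ok else alpha)
        for w, ok in zip(weights, correct)
    ]
    total = sum(updated)
    return [w / total for w in updated]  # renormalise so the weights sum to 1


# Four observations with equal weight; only the last was classified incorrectly.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
new_weights = reweight(weights, correct)
print(new_weights)
```

After the update the misclassified observation carries half of the total weight, so the next learner is pushed to focus on it.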
what is a confusion matrix
|
|
a 2x2 table that contains the 4 outputs provided by a binary classifier (TP, FP, FN, TN). various measures, such as error rate, accuracy, specificity, sensitivity, precision and recall, are derived from it
|
|
|
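A short sketch of how the four cells and some of the derived measures can be computed. Labels are assumed to be 1 for positive and 0 for negative, and the example predictions are made up.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FP, FN, TN) for binary labels encoded as 1 (positive) and 0 (negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def summary(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical ground truth and classifier output.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
measures = summary(*counts)
print(counts, measures)
```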
what is precision
|
|
TP/(TP+FP). precision is a good measure to use when the cost of a false positive (FP) is high.
|
|
|
what is recall
|
|
TP/(TP+FN). recall should be the model metric when there is a high cost associated with a false negative (FN)
|
|
|
what is the f1 score
|
|
2 x (precision x recall) / (precision + recall). it might be a better measure to use if we need to seek a balance between precision and recall and there is an uneven class distribution (a large number of actual negatives)
|
|
|
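To see why the balance matters on uneven classes, the sketch below computes precision, recall, and the combined score for a made-up case with many actual negatives. The counts are hypothetical.

```python
def precision(tp, fp):
    return tp / (tp + fp)


def recall(tp, fn):
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * (p * r) / (p + r)


# Hypothetical imbalanced case: 16 actual positives among 1000 observations.
tp, fp, fn, tn = 8, 2, 8, 982

accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = f1_score(tp, fp, fn)
print(accuracy, precision(tp, fp), recall(tp, fn), f1)
```

Accuracy looks excellent here because the true negatives dominate, while the combined score exposes that half of the actual positives were missed.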
what is data sampling
|
|
a statistical analysis technique. it is used to select, manipulate, and examine a representative subgroup of data points in order to identify trends
|
|
|
|
|
unrepresentative sample of data. it is when the data that has been mined, cleaned, and prepared for modeling is not illustrative of the data that the model will see once it is in use
|
|
|
what is regularization
|
|
adds a penalty to a model as its complexity increases. this helps prevent overfitting.
|
|
|
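One concrete form of such a penalty is an L2 (ridge) term added to the loss. Choosing L2 and the value of `lam` are assumptions made for this sketch; other penalties, such as L1, work on the same principle.

```python
def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n


def l2_penalty(weights, lam):
    """Penalty that grows with the squared size of the model weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(predictions, targets, weights, lam=0.1):
    return mse(predictions, targets) + l2_penalty(weights, lam)


# Two hypothetical models with the same fit error but different weight sizes.
predictions = [1.0, 2.0]
targets = [1.0, 2.5]
small_model = regularized_loss(predictions, targets, weights=[0.5, 0.5])
large_model = regularized_loss(predictions, targets, weights=[3.0, -4.0])
print(small_model, large_model)
```

With equal fit error, the model with larger weights pays a larger penalty, so training is nudged toward the simpler one.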
what is bias
|
|
bias is error introduced in your model due to oversimplification of the ML algorithm. it can lead to underfitting
|
|
|
what is variance
|
|
variance is error introduced in your model due to an overly complex ML algorithm: the model also learns noise from the training data set and performs badly on the test data set. it can lead to high sensitivity and overfitting
|
|
|
what is a gradient
|
|
Gradient is the direction and magnitude calculated during training of a neural network that is used to update the network weights in the right direction and by the right amount
|
|
|
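The role of the gradient in a weight update can be sketched with plain gradient descent. For brevity this uses a model with a single weight, y_hat = w * x, instead of a full neural network; the data, learning rate, and step count are made up.

```python
def gradient(w, xs, ys):
    """Derivative of the mean squared error with respect to w, for the model y_hat = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def train(xs, ys, w=0.0, learning_rate=0.05, steps=200):
    for _ in range(steps):
        # The sign of the gradient gives the direction, its size scales the amount.
        w -= learning_rate * gradient(w, xs, ys)
    return w


# Hypothetical data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = train(xs, ys)
print(w)
```

In a neural network the same update is applied to every weight, with the gradients obtained by backpropagation.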
explain how ROC curve works
|
|
the ROC curve is a graphical representation of the trade-off between the true positive rate (TPR) and the false positive rate (FPR) at various classification thresholds.
|
|
|
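The points of such a curve can be computed by sweeping a threshold over the classifier's scores. The scores, labels, and thresholds below are made up for illustration, and `roc_points` is a hypothetical helper name.

```python
def roc_points(scores, labels, thresholds):
    """Return one (false positive rate, true positive rate) pair per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores with their true labels (1 = positive).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels, thresholds=[1.0, 0.7, 0.5, 0.2, 0.0])
for fpr, tpr in points:
    print(fpr, tpr)
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more positives are caught, at the price of more false alarms.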