Monday, November 26, 2018

TensorFlow


TensorFlow !!

One more cool yet powerful framework/library for developing ML programs.

The underlying basis for this is a directed graph. The data (tensors) flows along the edges of the graph, through the nodes, which are the operations.

TensorFlow normally works in a lazy way, i.e., it first builds the graph and the execution happens later, in a separate step. There is also an 'eager' mode that executes operations immediately as they are called. TensorFlow programs can be written at the 'estimator' API level, or with another high-level API called 'keras', which makes it easy to define and run models.
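
To make the build-first, run-later idea concrete, here is a tiny sketch in plain Python. It is not TensorFlow code: Node, constant, add, multiply and run are names I made up for illustration, and the real library does far more. It only shows nodes as operations, edges as the values flowing between them, and nothing being computed until the graph is run.

```python
# A toy dataflow graph: nodes are operations, edges carry values between them.
# NOT TensorFlow: every name here is hypothetical and only illustrates
# lazy (build first, run later) execution.

class Node:
    def __init__(self, op, inputs=(), value=None):
        self.op = op          # name of the operation this node performs
        self.inputs = inputs  # incoming edges: nodes whose outputs feed this one
        self.value = value    # only used by constant nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def multiply(a, b):
    return Node("mul", (a, b))


def run(node):
    """Evaluate a node by first evaluating everything that flows into it."""
    if node.op == "const":
        return node.value
    args = [run(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown operation: " + node.op)


# Building the graph computes nothing yet ...
x = constant(3.0)
y = constant(4.0)
z = add(multiply(x, x), y)  # only describes 3*3 + 4

# ... the computation happens when the graph is run.
result = run(z)
print(result)
```

An eager style would simply compute 3.0 * 3.0 + 4.0 on the spot, with no graph object in between.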

This is how it can be installed:

The conda package manager can be used for this. Update conda itself, and then use conda to upgrade all the packages if you are not sure which dependent packages (numpy etc.) need upgrading.
Then use pip3 to install/upgrade tensorflow and keras.


Sunday, November 11, 2018

ML how to..



How to approach an ML problem.. here is what I think


1. Take a look at the data and decide what you want to do, e.g. what to predict - explore the data

2. Decide what type of ML problem it is - unsupervised / supervised (classification / regression)

3. Clean up the data: remove unwanted data and noise, decide on and create features, normalize feature values - data cleanup and feature engineering

4. Create the data sets - training/validation/test data sets

5. Train the model: supply the training and validation sets, choose the model parameters, loss function, number of iterations etc. - create and train a model

6. Evaluate the error on the training set and the validation set. Tune the parameters and re-train until the error is close on both - model validation and tuning

7. Check the model's predictions on the test data set, i.e. the data whose output labels the model never saw in the phases above - final check

8. All good? Release to live data. If not, repeat the above, revisiting the input features, the model and the model parameters to get a better fit - going live
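
Steps 4 to 7 can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the data is made up (one feature, a noisy 0/1 label), the model is a simple k-nearest-neighbours vote, and the 61/20/20 split and candidate values of k are arbitrary choices, not a recommendation.

```python
import random

random.seed(0)  # fixed seed so the noise and the split are repeatable


def make_point(i):
    """Made-up data: feature x in 0..10, label 1 when x > 5, with 10% label noise."""
    x = i / 10.0
    label = 1 if x > 5.0 else 0
    if random.random() < 0.1:
        label = 1 - label
    return (x, label)


data = [make_point(i) for i in range(101)]

# Step 4: shuffle, then split into training / validation / test sets.
random.shuffle(data)
train, validation, test = data[:61], data[61:81], data[81:]


# Step 5: the "model". For k-nearest-neighbours, training is just keeping
# the training set; prediction is a majority vote among the k closest points.
def predict(train_set, x, k):
    nearest = sorted(train_set, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train_set, eval_set, k):
    correct = sum(1 for x, label in eval_set if predict(train_set, x, k) == label)
    return correct / len(eval_set)


# Step 6: tune the parameter k using the validation set only.
candidates = [1, 3, 5, 7, 9]  # odd values, so the vote never ties
best_k = max(candidates, key=lambda k: accuracy(train, validation, k))
print("best k:", best_k)
print("training accuracy:", accuracy(train, train, best_k))
print("validation accuracy:", accuracy(train, validation, best_k))

# Step 7: one final check on data that played no part in the tuning.
test_accuracy = accuracy(train, test, best_k)
print("test accuracy:", test_accuracy)
```

If the training accuracy is much higher than the validation accuracy, the model is probably memorising the training data, which is the signal in step 6 to keep tuning.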




Saturday, November 10, 2018

Machine Learning




Machine Learning - as the name suggests, it is about teaching a machine to learn about something. The crux is how to teach the machine: the better you teach, the better it learns.

How do we teach a machine? Well, remember the word 'maths'? It is everywhere. Most logical problems are solved through maths, and that is how machines work, so we teach them the same way. To do that, we collect 'enough' data and feed it to the machine, asking it to fit a mathematical equation. By doing so, the machine learns about the data and its dynamics. Once that is done, the machine is able to predict the outcome for a future data point.
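
As a small illustration of 'fitting a mathematical equation', here is a sketch that fits a straight line y = slope * x + intercept to a handful of points using ordinary least squares, and then predicts a future point. The data is made up (it lies exactly on y = 2x + 1) and fit_line is just a name I picked.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels out in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data that follows y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)

# Predict the outcome for a data point the machine has not seen.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)
```

Real data never sits exactly on a line, so in practice the fitted equation is only the best compromise across all the points.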

The first step in doing so is to identify what kind of ML problem it is. Generally, an ML problem falls into one of two categories, or sometimes both! It depends on what you want.

                      ML Problem
                     /          \
                    /            \
           supervised            unsupervised
           /        \                  |
          /          \                 |
 classification   regression   grouping/clustering


Supervised problems are the cases where the data set contains both the input (features) and the output (label). The teaching process is guided by the output, so that the ML algorithm knows how well it is doing and how to adjust the model so that it fits the data better.

This is further categorized into two types. 'Classification' is the type of problem where the output takes one of two or a few distinct values, and the goal is to fit a model so that a future data point is classified into one of the given classes, e.g. which digit a given picture contains.

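'Regression' is the second type of supervised problem, where the output variable is a continuous value instead of a limited/discrete one, e.g. how many points your favorite team will score against a given opponent. (Whether the team wins or loses would be a classification problem, since there are only a few possible outcomes.)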


Unsupervised problems are the cases where the outcome is not known in advance, i.e. the data has no labels. So, the goal is to fit a model that identifies the clusters in the data set.
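
Here is a minimal sketch of clustering, assuming the simplest possible setting: one-dimensional points and the k-means algorithm with two starting centers that I picked by hand. The function name kmeans_1d and the data are made up for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D data.

    Returns the final centers and the clusters from the last assignment step.
    """
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# No labels here: just values that happen to form two groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

Nobody told the algorithm which point belongs where; the two groups come out of the data itself, which is the whole point of the unsupervised case.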

There are already a lot of ML algorithms, libraries and frameworks available, both as open source and as commercial products.




Friday, January 12, 2018

Web Application's Navigation Paths and Page Performance


I have been trying to find a better way to visualize and understand how a web application is accessed by its end users: what the end-user journey looks like, which pages end up forming close groups based on how the users are distributed and which application features interest them, and, finally, how smooth the path between pages is.

And I stumbled on a wonderful visualization tool called Gephi. Below is how a sample website looks, depicting the user navigation paths, the page grouping, and the page performance, which is characterized by the weights of the paths between pages. In graph terminology, the pages are nodes and the paths between pages are edges, with the response times as their weights.
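
For anyone who wants to try the same thing, here is a rough sketch of how such an edge list could be prepared. The navigation log is made up, and the function names are mine. The Source/Target/Weight column names are my assumption of what Gephi's spreadsheet import expects for an edge table, so check that against your Gephi version. Each edge weight is the average response time over all transitions between the same pair of pages.

```python
import csv
import io
from collections import defaultdict

# Hypothetical navigation log: (from_page, to_page, response_time_seconds).
navigation_log = [
    ("home", "search", 0.5),
    ("home", "search", 1.5),
    ("search", "product", 2.0),
    ("product", "cart", 1.0),
    ("home", "product", 3.0),
]


def build_edges(log):
    """Collapse repeated transitions into one edge weighted by mean response time."""
    totals = defaultdict(lambda: [0.0, 0])  # edge -> [sum of seconds, count]
    for source, target, seconds in log:
        entry = totals[(source, target)]
        entry[0] += seconds
        entry[1] += 1
    return {edge: total / count for edge, (total, count) in totals.items()}


def to_edge_csv(edges):
    """Render the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


edges = build_edges(navigation_log)
edge_csv = to_edge_csv(edges)
print(edge_csv)
```

Averaging is only one choice; a percentile or the maximum response time may highlight slow paths better, depending on what you want the picture to show.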