Thursday, January 28, 2010

Scientific arguments

Scientific arguments need scientific questions. When nothing requires your answers to be scientific, why would you make them so?


Friday, October 16, 2009

Overfitting vs Overtraining

What is overfitting?

In statistics, overfitting appears when a model is too complex to predict real data well. It has learned the past data very, very well but cannot generalize, so the real error on the data to be predicted increases.

Overfitting occurs in many domains. In statistics and in business, it occurs when the model is not suitable for future predictions. Let's take a look at how this data changes over time:

This kind of data was collected over a period of time. Now it is time to predict what the data will look like in the next period. Let's suppose that the values represent the number of items a company is going to sell. The company has to know how much stock to build in order to meet its customers' needs. If the company overestimates the stock, more money will be spent and the items could expire without being sold, so the company will lose money. If the company underestimates the stock, the customers will be angry, and many of them will look for an alternative (another company). The prediction error should therefore be minimized in order to get the full benefit from the business.
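To make the two kinds of loss concrete, here is a minimal Python sketch. The function name `stock_cost` and the per-item figures (4.0 lost per unsold item, 6.0 lost per missed sale) are made-up assumptions for illustration, not numbers from any real business.

```python
def stock_cost(stocked, demand, unit_cost=4.0, unit_margin=6.0):
    """Money lost when the stocked quantity differs from the real demand.

    unit_cost and unit_margin are hypothetical per-item figures.
    """
    if stocked >= demand:
        # Overestimate: unsold items were paid for and may expire.
        return (stocked - demand) * unit_cost
    # Underestimate: every missed sale loses its margin.
    return (demand - stocked) * unit_margin


demand = 100  # hypothetical real demand for the next period
for forecast in (80, 100, 120):
    print(forecast, stock_cost(forecast, demand))
```

Only the exact forecast costs nothing; an error in either direction loses money, which is why the prediction error matters so much.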

Choosing a suitable model is hard. In general, many parameters go into such a model: trend, number of inhabitants, time of year, weather, etc. (depending on the business and the items being sold).

In the next picture you'll see a bad example of prediction. The model is too simple to be used in a business environment, and the error on the training set is large (more than 50%). The prediction cannot be a good one, and because the model suffers from this large error, we'll call it an under-fitting model.


Another approach would be the other extreme: a model that fits all the elements of the past data (the training data).



Why is such a model not good?

Because it is simply not useful for generating predictions. It has no power to generalize because it was over-trained on the dataset for a long time in order to minimize the error on the training set. The real problem is that the training set is not the same as the real data that will be acquired in the future.


Then what is the difference between overfitting and overtraining?


As you can see, overfitting is a phenomenon that appears as a result of overtraining, but not only that. For instance, overfitting can also occur when too many parameters are used to create the model. The curve of the model can be approximated with a polynomial function of degree n. If n is too big, the model is going to be over-fitted. The same result can be obtained when using neural networks with too many nodes in the hidden layer. In this case the number of adjustable parameters (weights) increases, so the system can learn all the points given as inputs, which hurts generalization. To be continued.
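The polynomial case can be tried out with a small Python sketch. Everything in it is an assumption made for illustration: the sales figures are invented (a linear trend plus fixed noise), and the function names are hypothetical. A straight line (2 parameters) is compared with a degree 7 polynomial that passes through all 8 training points, which is the "n too big" situation.

```python
def fit_line(xs, ys):
    """Least-squares straight line: a model with only 2 parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial of degree len(xs) - 1 through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return predict


def mean_abs_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Invented data: trend y = 2x + 1 plus fixed noise for periods 0..7.
noise = [0.5, -0.4, 0.3, -0.6, 0.4, -0.3, 0.6, -0.5]
train_x = [float(i) for i in range(8)]
train_y = [2 * x + 1 + e for x, e in zip(train_x, noise)]
# "Future" periods, assumed to follow the same trend.
test_x = [8.0, 9.0]
test_y = [2 * x + 1 for x in test_x]

line = fit_line(train_x, train_y)
poly = fit_interpolant(train_x, train_y)

print("line: train", mean_abs_error(line, train_x, train_y),
      "future", mean_abs_error(line, test_x, test_y))
print("poly: train", mean_abs_error(poly, train_x, train_y),
      "future", mean_abs_error(poly, test_x, test_y))
```

The polynomial reproduces the training points, yet its error on the future periods is far larger than the line's, because it has also fitted the noise.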

Monday, June 22, 2009

Summer moments, seaside travel

Bulgaria has become one of the most challenging seaside destinations in the Balkans. By challenging I mean one of the most impressive places, where tourism is practiced at a high level. Here the service is excellent, and tourists are welcomed in many 4- and 5-star all-inclusive hotels.

Friday, May 1, 2009

Win a vacation in Bulgaria

If you want to win a vacation in Bulgaria for 2 people (plus one child up to 12 years old) at a very good hotel, you should enter the contest run by TravelPlanner.

The Castiga o vacanta in Bulgaria ("Win a vacation in Bulgaria") contest is in its third edition and has grown more complex than forecast, with over 4200 participants.

What should you do?

Create an account and promote your personal link among your friends, as well as on your blog, site, forums, etc., as stipulated in the contest rules.

What is the goal?

The prize goes to the person who promotes their personal link best and is voted for by the most people, even several times each (but on different days). Cheating is not allowed, for example: changing your IP on the network and voting for yourself as many times as you like (using a personal IP range, or that of the network you subscribe to), or using a public proxy. How is this checked? The "degree of IP dispersion" is verified.

People have already been disqualified in the current edition of the contest. For example, in the second edition there were people who gathered 18,000 votes in a single day of the contest, at a time when only promoting the link was required, not a vote from the visitor. This should be an alarm signal for advertising hunters who rely on traffic-counting systems that cannot detect this type of fraud (I am also thinking of the people at trafic.ro).

In the current edition, using IP ranges (cycling through IPs one after another) and manual voting are likewise frowned upon, although the enormous amount of work a person can put in to vote for themselves from "as many IPs as they can manage" remains impressive. But a Romanian is still a Romanian (I recall Vlad Țepeș, who dressed his army in Turkish clothes because otherwise, in the open, he would have lost the battle) and cannot let go of the tricks that might help him win at any price.

Vienna 2










Sunday, March 15, 2009

Vienna




Friday, February 20, 2009

Draw a custom function in ActionScript



Here is the sample. Fill the function input text field with your custom function. You may use any standard function (sin, cos, sqrt, etc.) and any arithmetic operation. You can then choose a precision between 0.01 and 10 and, of course, set the scale (the default is 10).

E.g.: type "sin(x)" into the function input text field, then press Live Draw Function to see the result.

The source code can be viewed using the "view source" context menu.
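For readers curious how such a plotter might turn the typed text into a curve, here is a rough sketch in Python (not the ActionScript source of the sample). It rests on assumptions: that "precision" is the step between sampled x values and that "scale" multiplies both coordinates. The names `ALLOWED` and `sample_function` are hypothetical.

```python
import math

# Hypothetical whitelist of names the typed expression may use.
ALLOWED = {"sin": math.sin, "cos": math.cos, "sqrt": math.sqrt, "pi": math.pi}


def sample_function(expression, x_min, x_max, precision=0.1, scale=10):
    """Return (screen_x, screen_y) points for the expression on [x_min, x_max]."""
    code = compile(expression, "<function>", "eval")
    steps = int(round((x_max - x_min) / precision))
    points = []
    for i in range(steps + 1):
        x = x_min + i * precision
        try:
            y = eval(code, {"__builtins__": {}}, {**ALLOWED, "x": x})
        except (ValueError, ZeroDivisionError):
            continue  # skip points outside the function's domain
        # Screen y grows downward, so the sign is flipped.
        points.append((x * scale, -y * scale))
    return points


points = sample_function("sin(x)", 0.0, math.pi, precision=0.5, scale=10)
print(points)
```

A drawing routine would then connect consecutive points with line segments. Note that emptying the builtins for `eval` is a convenience here, not a real sandbox for untrusted input.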

PS: You can draw some circles on the drawing surface using the mouse. It's not a bug, it's a feature! :D