Data Collection, Statistics and Input Variables

Having already discussed how to select a sport to bet on in this tutorial series on how to build your own Betting System, it’s time to deal with the input variables of the system we are going to develop. Now, don’t panic: saying “input variables” doesn’t mean we plan on building a nuclear reactor! We will simply enter some data into our system in order to make our calculations. That data can be anything we find useful that relates to the sport we have already selected. Let’s see some examples…

  • The most common and widespread data used in betting systems is statistics. Whether or not you are a fan of them, it is almost impossible not to use some at one stage of your system planning. There are team and player statistics, depending on the sport. They can include, but are not restricted to: goals for %, goals against %, points won %, aces %, hit and run %, home wins % and breaks %. They are usually expressed as percentages, although some are just plain numbers. If you are going to depend on statistics, make sure you have found the necessary resources (newspapers, websites, forums, etc.) that provide them and keep them up to date.
  • Another form of data you can use as input variables is simple sport facts: for example, the league standings, consecutive wins/losses, venue, weather conditions, form and ratings. You never know if they are going to come in handy, so bookmark the sites that give them for free.
  • You may, of course, always make use of other information found throughout the internet. Let’s say you used to work in financial markets and want to apply your charting knowledge to your betting. The simplest thing you can do is write down the trend of Betfair’s graph. How is that possible? Just writing down the first odds to be traded and the last ones close to kick-off is a start. It can get a lot more complicated, but it is up to you to work out what suits you best.
  • No, no, I haven’t forgotten. News, injuries, suspensions and rumours. Your favorite ones, hm? But why have I put them last? The truth is that they are a bit difficult to import into a betting system that is based more or less on a mathematical model. Yet they are admittedly quite important and can make a big difference to the efficiency of the system. The question is how we can define them in our variables. The simplest method, once again, is to set a number representing that news. Say, for some good news, add 1. For a major team injury, subtract 1. For some curious rumours (oh come on, fixed games? how’s that possible?), multiply by 2. You want the final number to be as close as possible to your gut feeling.
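The last two ideas above can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine: the function names, the event labels ("good_news", "major_injury", "rumour") and the starting score of 0 are all made up, since the text does not fix them. Events are applied in the order you list them, which matters once a "multiply by 2" rumour is involved.

```python
def odds_trend(first_odds, last_odds):
    """Summarise the move from the first traded odds to the last odds near kick-off."""
    change = last_odds - first_odds
    pct_change = change / first_odds * 100
    if change < 0:
        direction = "shortening"  # odds went down
    elif change > 0:
        direction = "drifting"    # odds went up
    else:
        direction = "flat"
    return direction, round(pct_change, 1)


def news_score(events, baseline=0):
    """Turn a list of news events into one number (baseline of 0 is an assumption)."""
    score = baseline
    for event in events:
        if event == "good_news":       # hypothetical label: add 1
            score += 1
        elif event == "major_injury":  # hypothetical label: subtract 1
            score -= 1
        elif event == "rumour":        # hypothetical label: multiply by 2
            score *= 2
        else:
            raise ValueError(f"unknown event: {event}")
    return score


print(odds_trend(2.50, 2.20))
print(news_score(["good_news", "good_news", "rumour"]))
```

Note that with a starting score of 0, a rumour on its own changes nothing, so you may prefer a different baseline or a different rule if that does not match your feeling for the game.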

Now that you have made up your mind on which of the above you are going to use as inputs in your betting system, arrange them in columns. Remember, your database may evolve into a very big and lengthy Excel spreadsheet, so design it with that in mind, because any change afterwards may be painful and time-consuming. The first columns must hold the players or teams, the date, and any other information that keeps track of where you stand (league, tennis tournament, etc.). Then come the input variables: the first input column is home wins %, the second home goals for %, the third rain/sunny/windy, and so on. The more inputs, the better the prediction (according to most), but also the more effort you have to make each day updating your database.
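If you would rather keep the database in a plain CSV file that Excel can open, the column layout described above could look roughly like this. The column names and the sample row are placeholders I made up for illustration, not real data; the sketch uses only Python's standard `csv` module.

```python
import csv
import io

# Hypothetical column names: tracking columns first, then the input variables.
TRACKING_COLUMNS = ["date", "league", "home_team", "away_team"]
INPUT_COLUMNS = ["home_wins_pct", "home_goals_for_pct", "weather"]
COLUMNS = TRACKING_COLUMNS + INPUT_COLUMNS


def rows_to_csv(rows):
    """Write rows (dicts keyed by column name) as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # a key that is not in COLUMNS raises ValueError
    return buffer.getvalue()


# Made-up placeholder values, only to show the shape of one row.
sample_row = {
    "date": "2024-01-01",
    "league": "Example League",
    "home_team": "Team A",
    "away_team": "Team B",
    "home_wins_pct": 60,
    "home_goals_for_pct": 55,
    "weather": "rain",
}

print(rows_to_csv([sample_row]))
```

Fixing the column list in one place means a new input is added by extending `INPUT_COLUMNS`, and a misspelled column name in a row fails loudly instead of silently creating a stray column.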

The worst thing at this stage of building your own betting system is that you don’t really know if the inputs you are entering are going to be useful at all. For that reason, and until you have enough sample data to work with, you’d better have as many inputs as you can. Later, you will be able to get rid of the unnecessary ones. But for now, keep them all.

OK, input variables done. Let’s move on to see how we can use them to calculate probabilities and value bets.