Sebastian Pokutta's Blog

Mathematics and related topics

Archive for the ‘Applications of Operations Research’ Category

Long time no see

with one comment

It has been quite a while since I wrote my last blog post; the last one that really counts (at least to me) was back in February. As I pointed out at some point, it was not that I lacked something to write about but rather that I did not want to “touch” certain topics. That in turn made me wonder what a blog is good for when, in fact, one is still concerned about whether to write about certain topics. So I got the feeling that in the end, all this web 2.0 authenticity, all this being really open, direct, authentic, etc., is nothing but a (self-)deception. On the other hand, I also did not feel like writing about yet another conference. I have to admit that I have been to some really crappy conferences lately, and since I did not have anything positive to say I preferred not to say anything at all. There were a few notable exceptions, e.g., the MIP workshop or IPCO. Another thing that bothered me (and still does) is the dilution of real information with nonsense. In fact, I have the feeling that the signal-to-noise ratio has dropped considerably over the last two years, and I did not want to add to this further. This feeling of over-stimulation with web 2.0 noise seems to be a global trend (at least that is my perception). Many people have given up their blogs or have been neglecting them. Also, maintaining a blog with, say, weekly posts (apart from passing on a few links or announcements) takes up a lot of time; time that arguably could be better invested in doing research and writing papers.

Despite those issues and concerns, I do believe that the web with all its possibilities can really enhance the way we do science. As with all new technologies, one has to find a modus operandi that provides positive utility. In principle the web can provide an information democracy and diversification; however, any “democratic endeavor” on the web has a huge enemy: the Matthew effect (commonly known as “more gains more”). This term, coined by R. K. Merton, derives its name from the following passage in the biblical Gospel of Matthew (see also Wikipedia):

For to all those who have, more will be given, and they will have an abundance; but from those who have nothing, even what they have will be taken away. — Matthew 25:29, New Revised Standard Version

In principle it states that the “rich get richer” while the “poor get poorer”. If we think of the different social networks (Facebook, MySpace, Friendster), it refers to the effect that the one with the largest user base attracts more new people than those with smaller ones. In the next “round” this effect is even more pronounced, until the smaller competitor virtually ceases to exist. In the real world this effect is often limited by various kinds of “friction”. There might be geographic limitations, cultural barriers, etc., that wash out the advantage of the larger one, so that the compounding nature of the effect is slowed down or non-existent (this holds true even in the highly globalized world we live in). That is the reason why dry cleaners, bakeries, and other forms of local business are not outperformed by globalized companies (ok, some are). On the internet, however, there is often no inhibitor to the Matthew effect. It often translates into some type of preferential attachment, although with the difference that the overall user base is limited, so that the gain of one party is the loss of another (preferential attachment processes are usually not zero-sum).
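To see the compounding at work, here is a small Python toy simulation (purely illustrative, nothing is calibrated): a fixed pool of users, each of whom occasionally reconsiders and re-joins one of two platforms with probability proportional to the current platform sizes. In this zero-sum variant the share of a platform is a martingale with absorbing states, so eventually one platform takes the whole pool, and the larger one wins with probability equal to its current share.

import random

def run_to_absorption(a=12, b=8, seed=None):
    """Zero-sum preferential attachment with a fixed user pool: in each step
    one random user reconsiders and re-joins a platform with probability
    proportional to the current sizes.  Sizes 0 and a+b are absorbing."""
    rng = random.Random(seed)
    while a > 0 and b > 0:
        if rng.random() < a / (a + b):   # the reconsidering user is on A ...
            a -= 1
        else:                            # ... or on B
            b -= 1
        if rng.random() < a / (a + b):   # ... and re-joins proportionally to size
            a += 1
        else:
            b += 1
    return a > 0                         # True if A ended up with everybody

wins = sum(run_to_absorption(seed=s) for s in range(1000))
print(f"platform A (60% initial share) ends up with all users in about {wins / 10:.0f}% of runs")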

So what does this mean in the context of blogs? Blog reading is, to a certain extent, zero-sum: there is a limited amount of time that we are willing to spend reading blogs. Those with a large reader base will have more active discussions and move higher up the priority list for reading. In the end the smaller ones might only have a handful of readers, making it hard to justify the amount of time spent writing the posts. Scaling down the frequency of posts might even amplify the effect, as it might be perceived as inactivity. One way out of this dilemma could be some form of joining the smaller units into larger ones, i.e., either “digesting” several blogs into a larger one or, alternatively, “shared blogging”. I haven’t made up my mind yet what (if anything!) I am going to do about this. But I guess in the end some type of consolidation is inevitable.

Having bothered you with this abstruse mixture of folklore, economics, and the internet, I actually intended to write about something else, but somewhat related, today: deciding whether and when to dump a project. This problem is very much inspired by my previous experiences as a consultant and by recent decisions about academic projects. More precisely, suppose that you have a project and an estimate for its overall duration. At some point you want to review the progress, and based on what you see at that point you want to make a call on whether or not to abandon the project. The longer you wait with your review, the better the information you gain from it; on the other hand, you have potentially wasted too much time and resources to increase the confidence in your decision. In fact, it might even make sense not to start a project at all. Suppose that you have an a priori estimate for the probability of success of your project, say p. Further, let r(t) denote our function of erring, i.e., r(0) = 1/2 and r(1) = 0, which means that at time t = 0 we do not have any information yet and can only guess, leading to a wrong guess with probability 50%, while at time t = 1 we have perfect information. Let t denote the point in time at which we review the project (as a fraction of the overall time, here assumed to be 1). We have four cases to consider (one might opt for a different payoff function; the following one resembles my particular choice):

  1. The project is going to be successful and at the point of reviewing we guessed right, i.e., we went through with it. In this case the benefit is s. This happens with probability (1-r(t)) p and expected payoff for this scenario is: (1-r(t)) p s. [alternatively one could consider the benefit s – t; or something else]
  2. The project is going to be successful and at the point of reviewing we guessed wrong, i.e., we dropped the project. In this case the benefit is – (t + s), i.e., we lose our investment up to that point (here with unit value) and the overall benefit. Probability is r(t) p and expected payoff – r(t) p (t+s).
  3. The project is going to fail and we guessed right: Benefit -t, i.e., the investment so far. Expected payoff – (1-r(t)) (1-p) t.
  4. The project is going to fail and we guessed wrong, i.e., we went through with it: Benefit -T, where T is some cost for this scenario. Expected payoff – r(t) (1-p) T.

All in all we have the following expected overall payoff as a function of t:

\mathbb E(t) = -[(1-r(t))p (-s) + (1-r(t))(1-p) t + r(t)p(t+s) + r(t)(1-p) T]

Next we have to define the function that models our increase in confidence. I opted for a function that gains information in a logarithmic fashion, i.e., in the beginning we gain confidence fast and then there is a tailing-off effect:

r_k(t) := \frac{1}{2}\,\frac{\log(1 + k)}{\log(1 + k) - \log(k)} - \frac{\log(k + t)}{2(\log(1 + k) - \log(k))} = \frac{\log(1 + k) - \log(k + t)}{2(\log(1 + k) - \log(k))}

The parameter k can be understood as the rate of learning. For example for k = 0.01 it looks like this:


Assuming s = 1 and T = 1 (i.e., the payoffs are basically the invested time) and p = 30%, the expected payoff as a function of the review time t looks like this (payoff: blue line, error rate: red line):

The maximum payoff is reached for a review after roughly 20% of the estimated overall time. However, it is still negative. This suggests that we do not learn fast enough to make a well-informed decision. For example, for k = 0.001 the situation looks different:

The optimal point for a review is after roughly 14% of the estimated project time. Once the rate of learning has been estimated, one can also determine which projects one should not get involved with at all. For k = 0.001 this is the case when the probability of success p is less than roughly 27%.
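For the curious, here is a small Python sketch of the model above (the payoff and the learning function r_k are exactly the ones defined earlier; SciPy's bounded scalar minimizer is used purely for convenience, a grid search over t would do just as well). The numbers it produces are in line with the figures quoted above.

import math
from scipy.optimize import brentq, minimize_scalar

def r(t, k):
    """Review error rate: r(0) = 1/2 (pure guessing), r(1) = 0 (perfect
    information), with fast initial learning that tails off."""
    return (math.log(1 + k) - math.log(k + t)) / (2 * (math.log(1 + k) - math.log(k)))

def expected_payoff(t, p, k, s=1.0, T=1.0):
    """Expected payoff of reviewing at time t, summed over the four cases."""
    return ((1 - r(t, k)) * p * s           # success, correctly continued
            - r(t, k) * p * (t + s)         # success, wrongly dropped
            - (1 - r(t, k)) * (1 - p) * t   # failure, correctly dropped
            - r(t, k) * (1 - p) * T)        # failure, wrongly continued

def best_review(p, k):
    """Optimal review time in (0, 1) and the corresponding expected payoff."""
    res = minimize_scalar(lambda t: -expected_payoff(t, p, k),
                          bounds=(0.0, 1.0), method="bounded")
    return res.x, -res.fun

for k in (0.01, 0.001):
    t_opt, value = best_review(p=0.3, k=k)
    print(f"k = {k}: review at t = {t_opt:.2f}, expected payoff {value:+.3f}")

# break-even success probability for k = 0.001: below this p the best achievable
# expected payoff is negative, so the project should not be started at all
p_min = brentq(lambda p: best_review(p, k=0.001)[1], 0.05, 0.95)
print(f"for k = 0.001, do not start projects with p below {p_min:.0%}")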

Although this model is admittedly very simple, it provides some nice qualitative (and partly quantitative) insights. For example, there are indeed projects that you should not even get involved with; this is somewhat clear from intuition, but I was surprised that the probability of success of such projects can still be quite high. Also, when over time your rate of learning increases (due to experience with other projects), you can get involved with riskier endeavors, because your higher review confidence allows you to purge more effectively. For example, when k goes down to, say, k = 0.00001 (which is not unrealistic, as in this case shortly after the beginning of the project you would only err with a probability of around 20%), you could get involved with projects that have a probability of success of only 19%.

And no complaints about the abrupt ending – I consumed my allocated blogging time.


Sikuli – a scripting language based on screenshots

leave a comment »

There is a pretty cool new scripting language, Sikuli, which is a “research project developed by the User Interface Design Group, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)”. The cool thing is that you can automate GUI interactions by using screenshots of buttons, sliders, input boxes, etc. But before I start to explain how it works, just check out the video.

The language itself is based on Jython, so you can use Jython to do almost everything you like, and it is fairly simple to use:


Example: Sikuli Code
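To give an idea of what such a script looks like in text form: a Sikuli script is essentially a Jython script in which image files (screenshots of the GUI elements) are passed to functions such as click(), wait(), or exists(). A rough sketch along those lines, where all the .png file names are made-up placeholders:

# Sikuli (Jython) sketch -- the .png arguments stand for screenshots of the
# GUI elements to interact with; the file names below are purely hypothetical.
switchApp("Firefox")            # bring the browser window to the front
wait("game_loaded.png", 30)     # wait up to 30 seconds for the game to load
if exists("plow_button.png"):
    click("plow_button.png")    # select the plow tool ...
    click("empty_plot.png")     # ... and plow an empty plot
click("market_button.png")      # open the market
click("soybeans_icon.png")      # pick the crop to plant
type("4")                       # e.g., fill in a quantity field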

So now you can finally automate your optimal FarmVille production schedule by having Sikuli interact with the applet ;-). Just put your production schedule into Sikuli and let it do the rest. Here is a GMPL model (based on the AMPL model from O.R. by the Beach – I just changed a few “;”s and removed the “option” lines) for determining the optimal production schedule that maximizes cash. It can be solved to optimality with GLPK in 0.1 seconds – the Sikuli code is left as an exercise 😉


param T;
set Horizon := 0..T;
set Crops;

param TotalArea;
param InitialArea;
param InitialMoney;
param PlowCost;
param Growth{Crops};
param Cost{Crops};
param Revenue{Crops};

var Plant{Crops,Horizon} integer >= 0;  # plots newly planted with each crop at time t
var Area{Horizon} >= 0, <= TotalArea;   # plots currently occupied
var Money{Horizon} >= 0;                # cash position

maximize z: Money[T] + 4*PlowCost;

subject to

# area balance: plots in use = previously used plots + newly planted - harvested
area0: Area[0] = InitialArea + sum {i in Crops} Plant[i,0];

area{t in 1..T}:  Area[t] = Area[t-1]
      + sum {i in Crops} Plant[i,t]
      - sum {i in Crops : t >= Growth[i]} Plant[i,t-Growth[i]];

# cash balance: money = previous cash - plowing and seed costs + harvest revenue
money0: Money[0] = InitialMoney - sum {i in Crops} (PlowCost + Cost[i])*Plant[i,0];

money{t in 1..T}: Money[t] = Money[t-1]
      - sum {i in Crops} (PlowCost + Cost[i])*Plant[i,t]
      + sum {i in Crops : t >= Growth[i]} Revenue[i]*Plant[i,t-Growth[i]];

solve;

display z;
display {i in Crops, t in Horizon : Plant[i,t] > 0}: Plant[i,t];
display Area,Money;

data;

param T := 36;
set Crops := SB EP WH SY;
param TotalArea := 144;
param InitialArea := 0;
param InitialMoney := 323;
param PlowCost := 15;

param:  Growth   Cost   Revenue :=
SB      1      10       35
EP     12      25       88
WH     18      35      115
SY      6      15       63;

end;
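If the model and its data section are saved together in a single file, say farmville.mod (the file name is of course arbitrary), it can be solved directly with GLPK's stand-alone solver via glpsol --model farmville.mod; the display statements at the end then print the planting schedule, the occupied area, and the cash position over time.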

Written by Sebastian

February 7, 2010 at 8:33 pm

Microbes outsmarting engineers/mathematicians (reloaded)

with one comment

A paper on solving combinatorial optimization problems with biological organisms made it into Science. The paper “Rules for Biologically Inspired Adaptive Network Design” by Atsushi Tero, Seiji Takagi, Tetsu Saigusa, Kentaro Ito, Dan P. Bebber, Mark D. Fricker, Kenji Yumiki, Ryo Kobayashi, and Toshiyuki Nakagaki explains how, under certain circumstances, a slime mold can reconstruct a network resembling the Tokyo rail system. From the abstract:

Transport networks are ubiquitous in both social and biological systems. Robust network performance involves a complex trade-off involving cost, transport efficiency, and fault tolerance. Biological networks have been honed by many cycles of evolutionary selection pressure and are likely to yield reasonable solutions to such combinatorial optimization problems. Furthermore, they develop without centralized control and may represent a readily scalable solution for growing networks in general. We show that the slime mold Physarum polycephalum forms networks with comparable efficiency, fault tolerance, and cost to those of real-world infrastructure networks—in this case, the Tokyo rail system. The core mechanisms needed for adaptive network formation can be captured in a biologically inspired mathematical model that may be useful to guide network construction in other domains.

While the authors do not claim any real (i.e., mathematical) optimality, and the solution quality is probably similar to what one would obtain with simulated annealing or ant colony optimization, i.e., sub-optimal solutions in many cases, Spiegel Online, a widely read website belonging to the Spiegel magazine, ran a story about the article a couple of days ago that gave the whole thing a slightly different touch – this reminded me very much of a blog post of Mike’s. The article starts with (in German – I will provide a hopefully faithful translation afterwards):

Was Ingenieure mit großem Aufwand versuchen, scheint für den Schleimpilz Physarum polycephalum eine Kleinigkeit: Verkehrswege möglichst effizient zu bauen.
Translation: What engineers try to achieve with a lot of effort seems to be a trivial task for the slime mold Physarum polycephalum: building traffic routes as efficiently as possible.

Here the witty author already gives a first indication and an executive-style summary of his opinion. And just in case you haven’t gotten the point that engineers, and in particular discrete optimization people, are superfluous, money-eating artifacts of times when excess capital had to be burnt in order to keep inflation low, the author stresses his point even further by making clear that the microbe is really dumb.

Er gehört zu den ältesten Lebensformen – und Intelligenz würde man dem schleimigen Winzling wohl kaum zusprechen.
Translation: It (the microbe) is one of the oldest life forms, and one would hardly ascribe intelligence to this slimy little thing.

If one takes a closer look at the article, and especially at the graph of the underlying network that the magic microbes reconstructed, the graph is almost trivial (from Fresh Photos):

If you check out the Spiegel Online article, which also provides pictures (I didn’t want to include those for obvious reasons), the time scale shows that obtaining the final solution took the magic microbe 26 hours. Needless to say, even a trivial branch-and-bound would probably do the job in less than 10 seconds.

In times where it is chic to suck at mathematics, a stupid microbe outperforming a whole branch of engineering and mathematics provides the perfect justification for dismissing mathematics as the most decadent form of barrenness. What is really questionable is that, at a time when mathematical illiteracy is to a large extent responsible (together with greed and the dismissal of the principle of universality) for the latest financial meltdown, there are still authors who give the impression that the underlying mathematical problems are trivial, and thereby provide a jaded view.

We need a Wikipedia page for that, distinguishing folklore optimization (yeah, every pseudo-consulting outlet does optimization) from mathematical optimization. Something that really distills the point. Unfortunately, this page here is not really informative for the non-expert!

Let us have a securitization party

with one comment

The concept of securitization is very versatile. From Wikipedia:

Securitization is a structured finance process that distributes risk by aggregating debt instruments in a pool, then issues new securities backed by the pool. The term “Securitisation” is derived from the fact that the form of financial instruments used to obtain funds from the investors are securities. As a portfolio risk backed by amortizing cash flows – and unlike general corporate debt – the credit quality of securitized debt is non-stationary due to changes in volatility that are time- and structure-dependent. If the transaction is properly structured and the pool performs as expected, the credit risk of all tranches of structured debt improves; if improperly structured, the affected tranches will experience dramatic credit deterioration and loss. All assets can be securitized so long as they are associated with cash flow. Hence, the securities which are the outcome of Securitisation processes are termed asset-backed securities (ABS). From this perspective, Securitisation could also be defined as a financial process leading to an issue of an ABS.

The cash flows of the initial assets are paid out according to the seniority of the tranches in a waterfall-like structure: first the claims of the most senior tranche are satisfied, and if there are cash flows remaining, the claims of the next tranche are satisfied. This continues as long as there are cash flows left to cover claims:

Individual securities are often split into tranches, or categorized into varying degrees of subordination. Each tranche has a different level of credit protection or risk exposure than another: there is generally a senior (“A”) class of securities and one or more junior subordinated (“B,” “C,” etc.) classes that function as protective layers for the “A” class. The senior classes have first claim on the cash that the SPV receives, and the more junior classes only start receiving repayment after the more senior classes have repaid. Because of the cascading effect between classes, this arrangement is often referred to as a cash flow waterfall. In the event that the underlying asset pool becomes insufficient to make payments on the securities (e.g. when loans default within a portfolio of loan claims), the loss is absorbed first by the subordinated tranches, and the upper-level tranches remain unaffected until the losses exceed the entire amount of the subordinated tranches. The senior securities are typically AAA rated, signifying a lower risk, while the lower-credit quality subordinated classes receive a lower credit rating, signifying a higher risk.

In more mathematical terms, securitization basically works as follows: take your favorite set of random variables (for the sake of simplicity, say binary ones) and consider the joint distribution of these variables (pooling). In a next step, determine percentiles of the joint distribution (of default, i.e., 0) that you sell off separately (tranching). The magic happens via the law of large numbers and the central limit theorem (and variants of it): although each variable can have a high probability of default, the probability that more than, say, x% of them default at the same time decreases (almost) exponentially. Thus the resulting x-percentile can have a low probability of default already for small x. That is the magic behind securitization, which is called credit enhancement.
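To make the credit-enhancement effect concrete, here is a toy Python calculation for a made-up pool of 100 identical loans, each defaulting independently with probability 10% (independence is the crucial, and in practice dangerous, assumption):

from math import comb

def prob_defaults_exceed(k, n=100, p=0.1):
    """P(strictly more than k of the n pooled loans default), assuming
    independent defaults with probability p each (a binomial tail)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1, n + 1))

# each loan defaults with probability 10%, yet the probability that losses eat
# through a subordination of 15/20/30 loans falls off (almost) exponentially
for k in (15, 20, 30):
    print(f"P(more than {k} of 100 loans default) = {prob_defaults_exceed(k):.3g}")

With correlated defaults the tail becomes much fatter and the enhancement much weaker, which is precisely the correlation issue mentioned in (a) below.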

So, given that this process of risk mitigation and of tailoring risks to the risk appetite of potential investors is rather versatile, why not apply the same concept to other cash flows that bear a certain risk of default and turn them into structured products 😉

(a) Rents: Landlords face the problem that a tenant’s credit quality is basically unknown. Often, a statement of the tenant’s income and liabilities is supposed to help estimate the risk of default, but this procedure can, at best, serve as an indicator. So why not use the same process to securitize the rent cash flows and sell the corresponding tranches back to the landlords? This would have several upsides. First of all, the landlord obtains a significantly more stable cash flow and, depending on his or her risk appetite, could even invest in the more subordinated tranches. This could potentially reduce rents, as the risk premium charged by the landlord due to his/her potentially risk-averse preferences could be reduced to the risk-neutral amount (plus some spreads, e.g., operational and structuring costs). The probability of default could be estimated much more easily for the pooled rent cash flows, as due to diversification it is well approximated by the expected value (maybe categorized into subclasses according to credit ratings). Of course, one would have to deal with problems such as adverse selection and the potentially hard task of estimating the correlation – which can have a severe impact on the value of the tranches (see my post here).

(b) Sports bets: Often these bets, viewed as random variables, have a high probability of default, e.g., roughly 50% for a balanced win/loss bet. In order to reduce the risk through diversification, a rather large amount of cash has to be invested to obtain a reasonable risk profile. Again, securitizing those cash flows could create securities with more tailored risk profiles that could be of interest to rather risk-averse people on the one hand and risk-affine gamblers on the other.

(c) …

That is the wonderful world of structured finance 😉

Written by Sebastian

December 30, 2009 at 2:35 pm

Heading off to AFBC 2009

with one comment

I am on my way to the 22nd Australasian Finance and Banking Conference 2009 in Sydney. So what the hell is a mathematician doing at a finance conference? Well, basically mathematics, and in particular optimization and operations research. I am thrilled to see the current developments in economics and finance that take computational aspects, which ultimately limit the amount of rationality that we can get, into account (I wrote about this before here, here, and here). In fact, I am convinced that these aspects will play an important role in the future, especially for structured products. After all, who is going to buy a structure whose value is impossible to compute? Not to mention other complications such as bad data or dangerous model assumptions (such as static volatilities and correlations, which are still used today!). Most valuation problems, though, can be cast as optimization problems, and especially the more complex structured products (e.g., mean-variance optimizers) explicitly ask for the solution of an optimization problem in order to be valued. For the simpler structures, Monte Carlo based approaches (or bi-/trinomial trees) are sufficient for pricing. As Arora, Barak, Brunnermeier, and Ge show in their latest paper, for more complex structures (e.g., CDOs) these approaches might fall short of capturing the real value of the structures, due to, e.g., deliberate tampering.

I am not going to talk about the aspect of computational resources, though: I will be talking about my paper “Optimal Centralization of Liquidity Management”, which is joint work with Christian Schmaltz from the Frankfurt School of Finance and Management. The problem that we are considering is basically a facility location problem: in a large banking network, where and how do you manage liquidity? In a centralized liquidity hub, or rather in smaller liquidity centers spread all over the network? Being short on liquidity is a very expensive matter: either one has to borrow money via the interbank market (which is usually dried up, or at least tight, in tougher economic conditions) or one has to borrow via the central bank. If neither is available, the bank goes into a liquidity default. The important aspect here is that the decision on the location and the amount of liquidity produced is driven to a large extent by the volatility of the liquidity demand. In this sense a liquidity center turns into an option on cheap liquidity, and in fact the value of a liquidity center can actually be captured in an option framework. The value of the liquidity center is the price of exact demand information: the more volatility we have, the higher this price will be and the more we save when we have this information in advance. The derived liquidity center location problem implicitly computes the prices of these options, which arise as marginal costs in the optimization model. Here are the slides:

View this document on Scribd
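As an aside, here is a tiny Python toy example (emphatically not the model from the paper, and with made-up numbers) of why demand volatility drives the value of advance demand information: a single branch fixes its liquidity buffer before demand is known, shortfalls have to be covered at an expensive rate, and idle liquidity carries an opportunity cost.

import random
from statistics import NormalDist

def value_of_demand_information(sigma, mu=100.0, c_short=0.05, c_excess=0.01,
                                n_trials=100_000, seed=7):
    """Newsvendor-style toy: a branch fixes its liquidity buffer q before the
    demand D ~ N(mu, sigma) is known; a shortfall costs c_short per unit
    (expensive interbank or central-bank funding), idle liquidity costs
    c_excess per unit.  With perfect advance information the mismatch cost
    would be zero, so the expected mismatch cost under the optimal buffer is
    exactly what knowing the demand in advance is worth."""
    rng = random.Random(seed)
    # optimal buffer without advance information: critical-fractile quantile
    q = mu + sigma * NormalDist().inv_cdf(c_short / (c_short + c_excess))
    cost = 0.0
    for _ in range(n_trials):
        d = rng.gauss(mu, sigma)
        cost += c_short * max(d - q, 0.0) + c_excess * max(q - d, 0.0)
    return cost / n_trials

for sigma in (5.0, 10.0, 20.0):
    print(f"demand volatility {sigma:5.1f}: value of perfect demand information "
          f"{value_of_demand_information(sigma):.3f}")

For normally distributed demand this value is proportional to sigma, which is the qualitative effect described above: the more volatile the demand, the more valuable the liquidity center’s informational advantage.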

Written by Sebastian

December 13, 2009 at 12:17 pm

Sketch your perfect picture

leave a comment »

Check this out:

We present a system that composes a realistic picture from a simple freehand sketch annotated with text labels. The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet. Although online image search generates many inappropriate results, our system is able to automatically select suitable photographs to generate a high quality composition, using a filtering scheme to exclude undesirable images. We also provide a novel image blending algorithm to allow seamless image composition. Each blending result is given a numeric score, allowing us to find an optimal combination of discovered images. Experimental results show the method is very successful; we also evaluate our system using the results from two user studies.

Here is the video:

And here is the link to the paper; they actually use dynamic programming for parts of their algorithm.

Written by Sebastian

October 6, 2009 at 9:51 pm

Mario AI Competition 2009

leave a comment »

Yesterday I learned about the Mario AI Competition 2009 (thx for the pointer). The goal of the competition is to provide an agent that automatically navigates through a version of the Mario game:

This competition is about learning, or otherwise developing, the best controller (agent) for a version of Super Mario Bros.

The controller’s job is to win as many levels (of increasing difficulty) as possible. Each time step (24 per second in simulated time) the controller has to decide what action to take (left, right, jump, etc.) in response to the environment around Mario.

We are basing the competition on a heavily modified version of the Infinite Mario Bros game by Markus Persson. That game is an all-Java tribute to Nintendo’s seminal Super Mario Bros game, with the added benefit of endless random level generation. We believe that playing this game well is a challenge worthy of the best players, the best programmers and the best learning algorithms alike.

Sounds like a perfect application for some optimization, and in fact Robin Baumgarten programmed an agent partly based on the A* algorithm. Check out the video below to see how fast the agent actually moves through the levels:

A comment on the youtube video: “So I undestand correctly: da computer will play video games for us, so we have more free time? Way cool.”
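For readers who have never seen A* in action, here is a generic Python sketch on a tiny toy grid with a Manhattan-distance heuristic. It is of course not Baumgarten’s Mario controller, where, roughly speaking, the search runs over simulated game states and the heuristic estimates the time needed to reach the right edge of the screen, but the algorithmic skeleton is the same.

import heapq

def a_star(grid, start, goal):
    """Generic A* on a 4-connected grid ('#' marks a wall); returns a
    shortest path from start to goal as a list of (row, col) cells."""
    def h(cell):                     # admissible heuristic: Manhattan distance
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start, None)]   # entries: (f, g, cell, parent)
    parent, best_g = {}, {start: 0}
    while open_heap:
        _, g, cell, par = heapq.heappop(open_heap)
        if cell in parent:           # already expanded via a shorter route
            continue
        parent[cell] = par
        if cell == goal:             # walk back through the parents
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        row, col = cell
        for nb in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if (0 <= nb[0] < len(grid) and 0 <= nb[1] < len(grid[0])
                    and grid[nb[0]][nb[1]] != '#'
                    and g + 1 < best_g.get(nb, float('inf'))):
                best_g[nb] = g + 1
                heapq.heappush(open_heap, (g + 1 + h(nb), g + 1, nb, cell))
    return None                      # goal not reachable

level = ["....#....",
         ".##.#.##.",
         ".#.......",
         ".#.###.#.",
         "...#....."]
print(a_star(level, (0, 0), (4, 8)))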

Written by Sebastian

August 14, 2009 at 10:34 pm