Discovering the relationships among G20 members using data mining

It takes only a short conversation with me to learn that I'm a fan of financial markets and of many subjects related to economics. Today I want to show an application that combines web news, Python, MongoDB and Gephi, a piece of software for visualizing and manipulating social networks.

Our goal is to check whether the number of joint occurrences of G20 countries (specifically with Brazil) in news related to the financial market reflects the trade data of the Brazilian Ministry of Development, Industry and Foreign Trade. For those unfamiliar with the term, the G20 (Group of Twenty) is a group made up of the finance ministers and central bank governors of 19 major economies plus the European Union. It was established in 1999, in the wake of the economic crises of the 1990s, as a forum for cooperation and consultation on international financial matters. The members are (in alphabetical order):

Argentina, Australia, Brazil, Canada, China, France, Germany, India, Indonesia, Italy, Japan, Republic of Korea, Mexico, Russia, Saudi Arabia, South Africa, Turkey, the United Kingdom, the United States and the European Union.

The table below lists the top 30 destinations for Brazilian exports, based on data from the Brazilian Ministry of Development, Industry and Foreign Trade for February 2014.

Let's go to the data. With an RSS reader developed in Python and MongoDB, I analyzed the content of 1,000 news articles (an arbitrary number) to find out whether any G20 countries were mentioned together in the same article. There is an excellent video published on TED - "Who Controls the World?" - that explains in good detail what a complex network is. I then loaded the collected data into Gephi to build a network of relationships between the countries. The result is the following network, which shows the relationships among the G20 countries in our context, with the thickness of each edge indicating the intensity of the relationship.
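To make the counting step concrete, here is a minimal sketch of how joint mentions could be turned into a weighted edge list. It is not the code used for this post: the `ALIASES` table, the sample articles and the function names are all assumptions made for illustration, and the RSS and MongoDB parts are left out. The `Source,Target,Weight` header is the column layout that Gephi's spreadsheet import normally recognizes for edge tables.

```python
import csv
import io
import re
from collections import Counter
from itertools import combinations

# Hypothetical alias table: Portuguese names as they might appear in the news,
# mapped to one canonical label per country (only a few G20 members shown).
ALIASES = {
    "brasil": "Brasil",
    "china": "China",
    "estados unidos": "EUA",
    "eua": "EUA",
    "japão": "Japão",
    "alemanha": "Alemanha",
}


def countries_in(text):
    """Return the set of canonical country labels mentioned in one article."""
    lowered = text.lower()
    found = set()
    for alias, label in ALIASES.items():
        # Word boundaries avoid matching an alias inside a longer word.
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            found.add(label)
    return found


def cooccurrence_counts(articles):
    """Count, for each unordered pair of countries, the articles naming both."""
    counts = Counter()
    for text in articles:
        for pair in combinations(sorted(countries_in(text)), 2):
            counts[pair] += 1
    return counts


def to_edge_csv(counts):
    """Serialize the counts as a Source,Target,Weight edge list."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(counts.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Made-up sample articles, standing in for the stored news items.
articles = [
    "Exportações do Brasil para a China crescem.",
    "Brasil, China e Estados Unidos discutem comércio.",
    "Bolsa do Japão fecha em alta.",
]

counts = cooccurrence_counts(articles)
edge_csv = to_edge_csv(counts)
print(edge_csv)
```

Each article contributes at most one to a pair's weight, no matter how many times the two names appear in it, which matches the idea of countries being "mentioned together in the same news". Counting every repeated mention instead would be a different, equally defensible choice.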

The names of the countries are in Portuguese because the news was too. Since our goal was to put Brazil in the spotlight and the news sources were all Brazilian, Brazil is the main node of our network. The thickness of the edges shows that Brazil is strongly related to China, the USA, Japan and Germany. If we look at the table of export destinations again, we see that these countries are, respectively, the 1st, 2nd, 5th and 6th destinations of our exports.

A more careful analysis shows that Brazil, in the news in our database, is related to almost all the countries, the exceptions being Saudi Arabia and England (although England is in some way covered by the relation with the UK). That is, of the top 30 destinations of our exports, only Saudi Arabia was not related. The next graph shows this conclusion by highlighting Brazil's relations (England and Saudi Arabia sit farther away and in a faded color).
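The same reading of the graph (strongest partners first, then the countries with no edge to Brazil) can also be done directly on the edge weights. The sketch below assumes the edges are held in a plain dictionary. The weights and the short member list are invented for illustration and are not the counts from this analysis, and `partners_of` is a hypothetical helper.

```python
# Illustrative weights only; these are NOT the counts from the article.
edges = {
    ("Brasil", "China"): 12,
    ("Brasil", "EUA"): 9,
    ("Brasil", "Japão"): 4,
    ("China", "EUA"): 7,
}

# Shortened, hypothetical member list (Portuguese labels, as in the graph).
members = ["Brasil", "China", "EUA", "Japão", "Arábia Saudita"]


def partners_of(edges, node):
    """Map each neighbor of `node` to the weight of the connecting edge."""
    result = {}
    for (a, b), weight in edges.items():
        if a == node:
            result[b] = weight
        elif b == node:
            result[a] = weight
    return result


partners = partners_of(edges, "Brasil")

# Strongest relations first; ties are broken alphabetically.
ranked = sorted(partners.items(), key=lambda item: (-item[1], item[0]))

# Members that never appeared in the same article as Brazil.
unrelated = sorted(set(members) - set(partners) - {"Brasil"})

print(ranked)
print(unrelated)
```

With real data, `ranked` is what would be compared against the export-destination table, and `unrelated` corresponds to the nodes drawn farther away in the highlighted graph.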

Even though national news was the primary data source, we can see the influence of the United States, the largest economy in the world, in its relationships with the other countries (except Saudi Arabia, though that may be due to the amount of data we have).

Given all this data, we conclude that the relations present in economic news do reflect the data on our trade relations. Perhaps it could not have been otherwise, but it is a way to show how everything is connected. In fact, if markets are efficient (there is much discussion here, and I tend to disagree with the theory), trade relations will be reflected in some way in the behavior of market players and, consequently, in the pricing of financial assets.



