SATOCONOR.COM Journal of RANDOMICS
Efficient Market Hypothesis and the
This article presents EMH software that can determine the strong, semi-strong and weak efficient parts of the NYSE
By Johan G. van der Galiën
For comments: johan.van.der.galien@satoconor.com
Version 1.0
Submitted: December 10, 2007
Revisions: April 28, 2008
Accepted: April 30, 2008
EMH software is presented in this paper. The main coefficient the software provides is the Efficiency Index (EI), which correlates significantly (F-Significance < 0.05) with the Signal-to-Noise Ratio and the Shannon Entropy of the historical prices, over a period of six years, of 100 stocks plus the Dow Jones Index. Plotted against the Efficiency Index, the Auto-Correlation is constant (around 0.9). Taking these 101 items as a representative sample of the whole NYSE, two-tailed hypothesis testing revealed that 13% of the market is weak efficient, 80% semi-strong and 7% strong. So only 7% behaves like a completely random process. The EMH software can identify the 13% weak efficient stocks, out of the 3554 registered at the NYSE, that can be bought and sold with an expectation of excess profits on the basis of fundamental analysis portfolio management.
1. Introduction
There are excellent articles on the Efficient Market Hypothesis (EMH) in Wikipedia, so I will not go into detail about it here. I also do not want to reveal too much about my trade secrets, so this brief summary will have to suffice. From the historical prices of the analyzed stocks, my EMH computer program calculates an Efficiency Index (EI). The hypotheses H0, H1a and H1b are tested two-tailed against the significance level α. The EMH program also calculates the power-of-the-test.
H0-hypothesis = Stock is semi-strong efficient because α ≤ EI ≤1-α
H1a-hypothesis = Stock is weak efficient because EI < α
H1b-hypothesis = Stock is strong efficient because EI > 1-α
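The three hypotheses amount to a simple threshold rule on the EI. The following is a minimal sketch of that rule only; how the EI itself is calculated is not disclosed in this paper. The function name `classify_efficiency` and the assumption that the EI lies in the interval [0, 1] are illustrative and not taken from the EMH program.

```python
def classify_efficiency(ei: float, alpha: float = 0.01) -> str:
    """Map an Efficiency Index to an EMH category (two-tailed rule).

    Assumption: EI lies in [0, 1], as implied by the bounds alpha and 1 - alpha.
    """
    if not 0.0 <= ei <= 1.0:
        raise ValueError("EI is assumed to lie in [0, 1]")
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    if ei < alpha:
        return "weak"         # H1a accepted: EI < alpha
    if ei > 1.0 - alpha:
        return "strong"       # H1b accepted: EI > 1 - alpha
    return "semi-strong"      # H0 accepted: alpha <= EI <= 1 - alpha
```

For example, with α = 0.01 an EI of 0.005 would be labelled weak, and an EI of 0.5 semi-strong.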
ACTUAL SITUATION | DECISION: NEGATIVE, H0 Accepted (Suspect EI falls in criterion) | DECISION: POSITIVE, H1 Accepted (Suspect EI outside criterion)
Natural (H0 is true), NEGATIVE | No Error | Type I Error (false positive = α probability)
Manipulated (H1 is true), POSITIVE | Type II Error (false negative = β probability) | No Error
Table 1: Hypothesis testing and the definitions of the α- and β-errors.

Fig. 1: The two distributions belonging to the two hypotheses, with the power-of-the-test and the α- and β-errors. For simplicity, one-tailed testing is shown here.
2. Materials and Methods
2.1. General remarks
The software was developed according to the Waterfall Model, of which a short summary follows in Section 2.2. For 100 stocks (+ Dow Jones Index), picked at random with the random-number (a-select) function of Excel out of the 3554 stocks registered on 12/3/2007 at the New York Stock Exchange (Wall Street), 1530 daily quotes from the period 11/8/2001 – 12/7/2007 were downloaded from the internet and used in the efficiency analysis functionality of the software. The software is still in the testing phase of version 3.1, which is meant as the demo program for version 4.0.
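The random selection of the sample can be sketched as follows. This is an illustrative Python equivalent, not the Excel procedure actually used; the function name `pick_sample`, the seed, and the placeholder ticker names are assumptions.

```python
import random

def pick_sample(tickers, sample_size=100, seed=None):
    """Draw sample_size distinct tickers uniformly at random, without replacement."""
    rng = random.Random(seed)          # a fixed seed makes the draw reproducible
    return rng.sample(list(tickers), sample_size)

# Placeholder universe of 3554 names; the real NYSE listing is not reproduced here.
universe = [f"STOCK{i:04d}" for i in range(1, 3555)]
sample = pick_sample(universe, sample_size=100, seed=2007)
```

Sampling without replacement guarantees that no stock is analyzed twice.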
2.2. Waterfall Model of the software development
2.2.1. Domain analysis
Not public information.
2.2.2. Software elements Analysis
Functional Requirements:
1. Windows desktop application
2. Efficiency analysis
· Calibration process
· Detection of positives and negatives of the semi-strong distribution
3. Power analysis
4. Probability of false positives
5. Probability of false negatives
6. Reporting to client from database tables
2.2.3. Specification
· VB.NET Windows application
· Built with Visual Basic Developer 2008 Express
· MS SQL Server 2005 Express database to store the results, which are reported to the GUI by means of SQL queries
· Reference to free web services to pull in daily or maybe even intraday trading quotes
· Real-time monitoring with an optimized, dynamically moving time window over the historical prices
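The moving time window in the last item can be pictured as a fixed-length buffer that drops the oldest quote whenever a new one arrives. The sketch below is a hypothetical illustration in Python, not the VB.NET implementation; the class name `MovingWindow` is an assumption, and how the window length is optimized is not described in this paper, so it is left as a parameter.

```python
from collections import deque

class MovingWindow:
    """Keep only the most recent `size` prices of one stock."""

    def __init__(self, size: int):
        if size < 1:
            raise ValueError("window size must be at least 1")
        self._prices = deque(maxlen=size)   # oldest entries fall out automatically

    def add(self, price: float) -> None:
        self._prices.append(float(price))

    def is_full(self) -> bool:
        return len(self._prices) == self._prices.maxlen

    def prices(self) -> list:
        return list(self._prices)

window = MovingWindow(size=3)
for quote in [10.0, 10.5, 10.2, 10.8]:
    window.add(quote)
# The window now holds the last three quotes only.
```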
2.2.4. Software architecture
Database Design:
1. INPUT data table
2. REPORT data table
Architecture:

2.2.5. Implementation
Not public information.
2.2.6. Testing
The software is intended as a commercial aid for fundamental analysis of the stock market. It was tested on a sample of 100 randomly picked NYSE stocks. This paper can be regarded as the public testing report.
2.2.7. Documentation
This paper is public documentation.
2.2.8. Software training and support
Can be provided on request.
2.2.9. Maintenance
Under heavy usage, where intraday trading quotes are pulled in through a web service, the database could grow beyond the 4-gigabyte limit of MS SQL Server 2005 Express. The only options are to purchase a license for the full version, which allows a bigger database, or to schedule stored procedures that delete obsolete rows that fall outside the moving time window.
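The second option, deleting rows that fall outside the moving time window, could look roughly like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs on its own; the table name `input_quotes`, its columns, and the 30-day window are hypothetical and not taken from the actual database design.

```python
import sqlite3
from datetime import date, timedelta

def prune_old_quotes(conn, today, window_days):
    """Delete quotes older than the moving time window; return the number removed."""
    cutoff = (today - timedelta(days=window_days)).isoformat()
    # ISO dates (YYYY-MM-DD) stored as text compare correctly as strings.
    cur = conn.execute("DELETE FROM input_quotes WHERE quote_date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")   # in-memory stand-in for the INPUT table
conn.execute("CREATE TABLE input_quotes (ticker TEXT, quote_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO input_quotes VALUES (?, ?, ?)",
    [
        ("XYZ", "2001-11-08", 10.0),   # made-up rows for illustration
        ("XYZ", "2007-12-06", 12.5),
        ("XYZ", "2007-12-07", 12.7),
    ],
)
deleted = prune_old_quotes(conn, today=date(2007, 12, 7), window_days=30)
```

In SQL Server the same DELETE statement would live in a scheduled stored procedure, with the cutoff date computed on the server.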
3. Results

Graph 1: EI versus the Signal-to-Noise Ratio of the normal distribution fit of the data. F-Significance = 0.04538

Graph 2: EI versus First Order Serial Correlation (Auto-Correlation) of the normal distribution fit of the data. F-Significance = 0.459271

Graph 3: EI versus Shannon Entropy of the data distribution. F-Significance = 0.015047
Out of the 100 stocks (+ Dow Jones Index), 7 are strong, 13 weak and 80 semi-strong efficient at α = 0.01; the calculated β was always 0 (so the power-of-the-test is always 1). According to several calculations, these 101 items are a representative sample of the 3554 stocks registered on the NYSE.
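If the sample proportions are taken to hold for the whole exchange, the counts above translate into approximate numbers of NYSE stocks per category. The arithmetic is sketched below; the function name `extrapolate_counts` is illustrative, and the results are point estimates only, with no confidence interval attached.

```python
def extrapolate_counts(sample_counts, population_size):
    """Scale category counts from a sample up to the full population."""
    total = sum(sample_counts.values())
    return {
        category: round(count * population_size / total)
        for category, count in sample_counts.items()
    }

estimated = extrapolate_counts({"weak": 13, "semi-strong": 80, "strong": 7}, 3554)
# estimated == {"weak": 462, "semi-strong": 2843, "strong": 249}
```

Each category is rounded separately, so in general the rounded values need not sum exactly to the population size, although here they do.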
4. Discussion
Graph 1 shows a linear correlation between the Signal-to-Noise Ratio (SNR = µ/σ) and the Efficiency Index (EI); the F-Significance falls within a reasonable significance level (α < 0.05). I personally believe that SNR is the best measure of the volatility of the stock in question. If this is true, EI also incorporates volatility. But there is more: EI correlates even better with the Shannon Entropy of the data distribution (Graph 3). Shannon Entropy is the best measure of the information content of a system. Maximal Shannon Entropy means maximal information, which means a completely random process. There is no linear regression with the Auto-Correlation of the normal distribution: the line is almost parallel to the x-axis and is more like a constant mean value through all the data points (Graph 2).
Besides the good correlation of EI with SNR and Entropy, there are other reasons, which I do not provide because they would reveal my trade secrets, to believe that deviations from the H0-hypothesis under two-tailed testing divide the NYSE into weak, semi-strong and strong efficiency parts. Because 100 stocks + the Dow Jones Index are a representative sample, the test also gives the percentages for the whole Wall Street stock market.
Of course I can provide the ticker symbols or the names of the individual stocks in, let's say, the weak efficiency part, which could then be bought and sold with an expectation of excess profits based on fundamental analysis portfolio management, but this is not a free service. What I do in this paper is a new kind of technical analysis meant to be a part of the more general fundamental analysis.
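The three reference measures discussed above can be computed from a price series as sketched below. This is not the EMH program: the function names are illustrative, the entropy is taken over an equal-width histogram whose bin count is an assumption, and the measures are computed on the raw prices rather than on a normal distribution fit.

```python
import math
from collections import Counter

def signal_to_noise(prices):
    """SNR = mean / sample standard deviation (mu / sigma)."""
    n = len(prices)
    mean = sum(prices) / n
    variance = sum((p - mean) ** 2 for p in prices) / (n - 1)
    if variance == 0:
        raise ValueError("SNR is undefined for a constant series")
    return mean / math.sqrt(variance)

def shannon_entropy(prices, bins=10):
    """Shannon entropy (in bits) of an equal-width histogram of the prices."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        return 0.0                      # a constant series carries no information
    width = (hi - lo) / bins
    # The maximum value is clamped into the last bin.
    counts = Counter(min(int((p - lo) / width), bins - 1) for p in prices)
    n = len(prices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def lag1_autocorrelation(prices):
    """First-order serial correlation of the series."""
    n = len(prices)
    mean = sum(prices) / n
    num = sum((prices[i] - mean) * (prices[i + 1] - mean) for i in range(n - 1))
    den = sum((p - mean) ** 2 for p in prices)
    return num / den
```

With `bins` equal-width bins the entropy is at most log2(bins) bits, reached when the prices are spread evenly over all bins.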
5. Conclusions
· All EMH stocks have an Auto-Correlation of around 0.9.
· EI of a stock incorporates the SNR and Shannon Entropy measures.
· According to the EI, 13% of the NYSE stocks are weak efficient, 80% are semi-strong and 7% are strong.
· So only 7% of the NYSE stocks are bought and sold in a completely random fashion.
· Stock markets must be constantly monitored with an optimized time window for historical prices because I am convinced that individual stocks can shift in efficiency category.
· My EMH software can identify the 13% weak efficient stocks, out of the 3554 registered at the NYSE, that can successfully be subjected to portfolio management based on fundamental analysis.
Acknowledgement
Many thanks are due to MICROSOFT for providing the Visual Basic Developer 2008 Express and the SQL Server 2005 Express editions as freeware.
Also freeware and an excellent software design: DATAPLOT from the National Institute of Standards and Technology (NIST) in the
-o0o-