According to the MAMA system, the average web page has 47 markup errors and some 80 per cent of sites use CSS.
Opera Software is creating a search engine to track the structure of the web, in the hope of helping browser developers and standards bodies make it a better place.
The Metadata Analysis and Mining Application (MAMA) doesn’t index content like a standard search engine, but looks at markup, style, scripting and the technology behind pages.
This will let developers ask questions such as “can I get a sampling of web pages that have more than 100 hyperlinks?” and get an answer based on 3.5 million pages, Opera said.
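To give a feel for what a structural query of this kind involves, here is a minimal sketch in Python. It is not MAMA's implementation, which Opera has not published; the names `LinkCounter`, `count_hyperlinks` and `pages_with_many_links` are hypothetical, and the sketch assumes the pages have already been fetched into a dictionary of URL to HTML text.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> elements that carry an href attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1


def count_hyperlinks(html):
    """Return the number of hyperlinks found in one HTML document."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.links


def pages_with_many_links(pages, threshold=100):
    """Return the URLs whose markup contains more than `threshold` hyperlinks.

    `pages` is assumed to map URL -> already-downloaded HTML text.
    """
    return [url for url, html in pages.items()
            if count_hyperlinks(html) > threshold]


# Illustrative data only, not taken from MAMA's data set.
sample_pages = {
    "http://example.com/link-heavy": '<a href="/x">x</a>' * 150,
    "http://example.com/sparse": '<p>One <A HREF="/y">link</A> only.</p>',
}
matches = pages_with_many_links(sample_pages)
```

A system working across millions of pages would presumably store counts like these in a database rather than re-parsing markup for each question, but the per-page measurement is the same idea.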
Based on the pages MAMA has analysed so far, 80.4 per cent of sites use cascading style sheets (CSS), and the average web page has 47 markup errors and 16,400 characters. MAMA can also say which country uses the AJAX component XMLHttpRequest the most: Norway, with 10.2 per cent of the data set.
Opera hopes such information will help web developers to influence standards bodies by showing the reality of what is happening online, thereby improving the interoperability of browsers. “The web is fragmented, complex and always evolving. MAMA’s vast database provides us with detailed information about how Web technologies are used,” said Snorre M. Grimsby, vice president of quality assurance at Opera.
MAMA will be made public in the next few months, Opera said. “This is key in our efforts to test and ensure high-quality compatibility, stability and performance of our products, and we want to share it with our peers, so they can benefit from it too,” Grimsby said.