How to compress WURFL by a factor of 1000?

If you build a site for mobile devices, you have to solve one problem first: how do you tell a user who visits the site from a computer apart from a user who visits it from a mobile device? Several scripts already solve this problem.

But mobile devices differ greatly in their capabilities. Some support only WML, some support mobile XHTML, and some (iMode) support cHTML. Moreover, some PDAs display ordinary HTML well. Here the situation is more complex, and few scripts can sort devices into these four groups.
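Once a device has been assigned to one of the four groups, the server still has to send a matching Content-Type header. The following is only a minimal sketch: the function name contentTypeFor() is hypothetical, the group labels are the ones used by the script later in this post, and the MIME types are the commonly used ones for each markup rather than anything prescribed by WURFL.

```php
<?php
// Hypothetical helper (not part of WURFL or of the script in this post):
// maps one of the four markup groups to a Content-Type value.
function contentTypeFor($group)
{
    $types = array(
        'WML'   => 'text/vnd.wap.wml',
        'XHTML' => 'application/vnd.wap.xhtml+xml',
        'IMODE' => 'text/html', // assumption: cHTML is served as ordinary text/html
        'HTML'  => 'text/html',
    );
    // Unknown group: fall back to ordinary HTML.
    return isset($types[$group]) ? $types[$group] : 'text/html';
}

// Usage: header('Content-Type: ' . contentTypeFor('WML'));
```

Some XHTML devices also accept application/xhtml+xml or plain text/html, so the table may need adjusting for a particular audience.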

However, there is one more question: what is the error rate of these scripts? Unfortunately, you will not find an answer to this question. But there is no doubt that errors occur. How can they be reduced to a minimum?

The WURFL library comes to the rescue. It determines practically all characteristics of a device from the “USER_AGENT” string (which the device passes to the server and which identifies the browser and OS). In particular, for each device the “preferred_markup” property names the markup language that the device renders best:

  • ‘wml_1_1’, ‘wml_1_2’, ‘wml_1_3’: WML;
  • ‘html_wi_imode_compact_generic’, ‘html_wi_imode_html_1’, ‘html_wi_imode_html_2’, ‘html_wi_imode_html_3’, ‘html_wi_imode_html_4’, ‘html_wi_imode_html_5’: IMODE;
  • ‘html_wi_oma_xhtmlmp_1_0’, ‘html_wi_w3_xhtmlbasic’: XHTML;
  • ‘html_web_3_2’, ‘html_web_4_0’: HTML.
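The list above can be written down as a lookup table. This is only a sketch: the function name markupGroup() is hypothetical, and the table contains exactly the values listed above, so any other “preferred_markup” value yields null.

```php
<?php
// Hypothetical helper: reduces a WURFL "preferred_markup" value
// to one of the four groups listed above.
function markupGroup($preferredMarkup)
{
    static $groups = array(
        'wml_1_1' => 'WML',
        'wml_1_2' => 'WML',
        'wml_1_3' => 'WML',
        'html_wi_imode_compact_generic' => 'IMODE',
        'html_wi_imode_html_1' => 'IMODE',
        'html_wi_imode_html_2' => 'IMODE',
        'html_wi_imode_html_3' => 'IMODE',
        'html_wi_imode_html_4' => 'IMODE',
        'html_wi_imode_html_5' => 'IMODE',
        'html_wi_oma_xhtmlmp_1_0' => 'XHTML',
        'html_wi_w3_xhtmlbasic' => 'XHTML',
        'html_web_3_2' => 'HTML',
        'html_web_4_0' => 'HTML',
    );
    // A value that is not in the list above is reported as null.
    return isset($groups[$preferredMarkup]) ? $groups[$preferredMarkup] : null;
}

echo markupGroup('html_wi_w3_xhtmlbasic'), "\n"; // XHTML
```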

But this universality comes at a price: the library uses a database stored in XML format, in a file larger than 6 MB. Naturally, parsing such a file takes a long time (or, if caching is used, requires a lot of disk space).

Having spent a fair amount of time analyzing the wurfl.xml database, I wrote a simple script that divides mobile devices by supported markup language (WML/IMODE/XHTML/HTML), just as the WURFL library would:

<?php
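/**
 * Returns the markup group ('WML', 'IMODE', 'XHTML' or 'HTML') for a USER_AGENT string.
 * Patterns are tested from top to bottom and the first match wins, so each
 * specific pattern must stay above the more general ones that would also match.
 */
function CheckMobile($ua)
{
$m=array(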
'|^DoCoMo|'=>'IMODE'
,'|SG345i|'=>'WML'
,'|K610im|'=>'XHTML'
,'|N411i|'=>'XHTML'
,'|NK601i|'=>'XHTML'
,'|^portalmmm|'=>'IMODE'
,'|^o2imode|'=>'IMODE'
,'|^LG-C3100|'=>'IMODE'
,'|^TSM-7/53118000|'=>'IMODE'
,'|^KGT|'=>'IMODE'
,'|^Qtek S200|'=>'HTML'
,'|^SIE-C3I|'=>'XHTML'
,'|^LGE-VX4|'=>'XHTML'
,'|up\.browser/4|i'=>'WML'
,'|^BlackBerry/2\.5|'=>'WML'
,'|blackberry|i'=>'XHTML'
,'|^AUDIOVOX-CDM9500|'=>'XHTML'
,'|^SIE-A76/34|'=>'WML'
,'|^SIE-A71/19|'=>'WML'
,'|^SPH-A880|'=>'WML'
,'|^SAMSUNG-SGH-T319|'=>'WML'
,'|^SAGEM-myX-5m|'=>'WML'
,'|^T618X|'=>'WML'
,'|up\.browser/6|i'=>'XHTML'
,'|^SharpT71|'=>'WML'
,'|^SharpWXT71|'=>'WML'
,'|^Vodafone/1\.0/703SH|'=>'WML'
,'|^Vodafone/703SH|'=>'WML'
,'|up\.browser/7|i'=>'XHTML'
,'|^sharp|i'=>'XHTML'
,'|^TSM|'=>'WML'
,'|^MOT-ROKR|'=>'XHTML'
,'|^Mozilla/5\.0 \(1UP|'=>'HTML'
,'|up|i'=>'WML'
,'|^Nokia6708|'=>'WML'
,'|^Nokia7650|'=>'WML'
,'|Nokia6290|'=>'HTML'
,'|Nokia 7650|'=>'HTML'
,'| N-Gage|'=>'HTML'
,'|Symbian|'=>'XHTML'
,'|^Palm|'=>'XHTML'
,'|^SAMSUNG-SGH-I600|'=>'XHTML'
,'|smartphone|i'=>'HTML'
,'|IEMobile|'=>'XHTML'
,'|^YourWap Siemens C35i|'=>'XHTML'
,'|^YourWap|'=>'WML'
,'|^Motorola-C290|'=>'WML'
,'|^Motorola-MPX200|'=>'HTML'
,'|motorola|'=>'XHTML'
,'|^Nokia11|'=>'WML'
,'|^Nokia2652|'=>'WML'
,'|^Nokia3300|'=>'XHTML'
,'|^Nokia33|'=>'WML'
,'|^Nokia3410|'=>'WML'
,'|^Nokia3510|'=>'WML'
,'|^Nokia3560|'=>'WML'
,'|^Nokia3610|'=>'WML'
,'|^Nokia5100/1\.0|'=>'WML'
,'|^Nokia5210|'=>'WML'
,'|^Nokia5510|'=>'WML'
,'|^Nokia5555|'=>'WML'
,'|^Nokia6100|'=>'WML'
,'|^Nokia6108|'=>'WML'
,'|^Nokia6125|'=>'WML'
,'|^Nokia6210|'=>'WML'
,'|^Nokia6250|'=>'WML'
,'|^Nokia630|'=>'XHTML'
,'|^Nokia63|'=>'WML'
,'|^Nokia6510|'=>'WML'
,'|^Nokia6560|'=>'WML'
,'|^nokia6610i|i'=>'XHTML'
,'|^Nokia6610|'=>'WML'
,'|^Nokia6800/1\.0|'=>'WML'
,'|^Nokia71|'=>'WML'
,'|^Nokia7210|'=>'WML'
,'|^Nokia7250I|'=>'XHTML'
,'|^Nokia7250|'=>'WML'
,'|^Nokia8600|'=>'XHTML'
,'|^Nokia8800|'=>'XHTML'
,'|^Nokia8|'=>'WML'
,'|^Nokia9|'=>'XHTML'
,'|^Nokia N70|'=>'XHTML'
,'|^Nokia |'=>'WML'
,'|^NOKIA-6256i|'=>'XHTML'
,'|^NOKIA-|'=>'WML'
,'|^Nokia/3587|'=>'WML'
,'|Nokia 8310|'=>'WML'
,'|Series60\/2\.0 Nokia6600|'=>'HTML'
,'|nokia|i'=>'XHTML'
,'|^6310i|'=>'WML'
,'|^6590|'=>'XHTML'
,'|^SendoX/1\.0|'=>'XHTML'
,'|^Sendo|'=>'WML'
,'|^SIE-IC35|'=>'WML'
,'|^SIE-CL50|'=>'WML'
,'|^SIE-SF65v2|'=>'XHTML'
,'|^SIE-SF65|'=>'WML'
,'|^SIE-SX66|'=>'HTML'
,'|^SIE|'=>'XHTML'
,'|^SL45i|'=>'WML'
,'|^M50|'=>'WML'
,'|^MT50|'=>'WML'
,'|^C55|'=>'WML'
,'|^S55|'=>'XHTML'
,'|^Vodafone/V902T|'=>'WML'
,'|^Vodafone/1\.0/HTC_Mercury|'=>'HTML'
,'|Vodafone|'=>'XHTML'
,'|PDA|'=>'XHTML'
,'|Cellphone|'=>'WML'
,'|iPhone|'=>'XHTML'
,'|jBrowser|'=>'WML'
,'|NW\.Browser|'=>'WML'
,'|AvantGo|'=>'XHTML'
,'|blazer|i'=>'XHTML'
,'|^Mitsu|'=>'WML'
,'|NEC|'=>'XHTML'
,'|^SonyEricssonP900|'=>'XHTML'
,'|^MERIDIAN-Z400|'=>'XHTML'
,'|rev|i'=>'WML'
,'|^AU-MIC|'=>'WML'
,'|^SEC-SGHD800|'=>'WML'
,'|^Mozilla/4\.0 NetFront/3|'=>'HTML'
,'|NetFront|'=>'XHTML'
,'|^Alcatel-OT-C55|'=>'XHTML'
,'|^Alcatel-OT-C701|'=>'XHTML'
,'|^Alcatel|'=>'WML'
,'|MIB/1|'=>'WML'
,'|^MOT-A820|'=>'WML'
,'|^MOT-A|'=>'XHTML'
,'|^MOT-E390|'=>'WML'
,'|^MOT-E825|'=>'WML'
,'|^MOT-E|'=>'XHTML'
,'|^MOT-V500|'=>'XHTML'
,'|^MOT-T191|'=>'WML'
,'|MIB/2\.2|i'=>'XHTML'
,'|BER|'=>'XHTML'
,'|Motorola|'=>'XHTML'
,'|^MOT-C168|'=>'XHTML'
,'|^MOT-T725|'=>'XHTML'
,'|^MOT-V180|'=>'XHTML'
,'|^MOT-V220|'=>'XHTML'
,'|^MOT-V3|'=>'XHTML'
,'|^MOT-V55|'=>'XHTML'
,'|^MOT-V66|'=>'WML'
,'|^MOT-V69|'=>'WML'
,'|^MOT-V6|'=>'XHTML'
,'|^MOT-V860|'=>'XHTML'
,'|^MOT-W220|'=>'XHTML'
,'|^MOT-MPX200|'=>'HTML'
,'|^MOT-|'=>'WML'
,'|^MotV600|'=>'XHTML'
,'|^SAGEM-myW|'=>'WML'
,'|^SAGEM-myX-3|'=>'WML'
,'|^SAGEM-myX-5|'=>'WML'
,'|^SAGEM|'=>'XHTML'
,'|WAP1|'=>'WML'
,'|^LG-C3380|'=>'WML'
,'|^LG-KG240|'=>'WML'
,'|^LG-F2250|'=>'WML'
,'|^DALLAB|'=>'HTML'
,'|WAP2|'=>'XHTML'
,'|^PHILIPS|'=>'WML'
,'|^PT-|'=>'WML'
,'|^PG-3000|'=>'XHTML'
,'|^PG-3210|'=>'XHTML'
,'|^PG-3500|'=>'XHTML'
,'|PG-6100|'=>'XHTML'
,'|PG|'=>'WML'
,'|^SEC-SGHC|'=>'WML'
,'|^SEC-SGHD41|'=>'WML'
,'|^SEC-SGHD5|'=>'XHTML'
,'|^SEC-SGHE320|'=>'WML'
,'|^SEC-SGHE71|'=>'WML'
,'|^SEC-SGHE72|'=>'WML'
,'|^SEC-SGHE81|'=>'WML'
,'|^SEC-SGHE88|'=>'WML'
,'|^SEC-SGHE|'=>'XHTML'
,'|^SEC-SGHP400|'=>'XHTML'
,'|^SEC-SGHP716|'=>'XHTML'
,'|^SEC-SGHP|'=>'WML'
,'|^SEC-SGHX16|'=>'WML'
,'|^SEC-SGHX20|'=>'WML'
,'|^SEC-SGHX480|'=>'XHTML'
,'|^SEC-SGHX4|'=>'WML'
,'|^SEC-SGHX510|'=>'WML'
,'|^SEC-SGHX610|'=>'WML'
,'|^SEC-SGHX|'=>'XHTML'
,'|^SEC-SGHS|'=>'XHTML'
,'|^SEC-SGHI700|'=>'HTML'
,'|^SEC-SGH|'=>'WML'
,'|^SAMSUNG-SGH-C|'=>'WML'
,'|^SAMSUNG-SGH-E320|'=>'WML'
,'|^Samsung-SGH-T|i'=>'WML'
,'|^SAMSUNG-SGH-X300|'=>'WML'
,'|^SAMSUNG-SGH-X481|'=>'WML'
,'|SGH|'=>'XHTML'
,'|^Samsung-SPHA8|'=>'WML'
,'|^Samsung_MITs|'=>'WML'
,'|^SAMSUNG-Sva|'=>'WML'
,'|^FAKE_Samsung|'=>'WML'
,'|Samsung|'=>'XHTML'
,'|Samsu3G|'=>'XHTML'
,'|^Ericsson|'=>'WML'
,'|TelecaBrowser|'=>'WML'
,'|^SonyEricssonJ2|'=>'WML'
,'|^SonyEricssonT226|'=>'XHTML'
,'|^SonyEricssonT290|'=>'XHTML'
,'|^SonyEricssonT620|'=>'XHTML'
,'|^SonyEricssonT687i|'=>'XHTML'
,'|^SonyEricssonT|'=>'WML'
,'|T610|'=>'WML'
,'|^SonyEricssonZ2|'=>'WML'
,'|SonyEricsson|'=>'XHTML'
,'|^P800|'=>'XHTML'
,'|^Panasonic-SA|'=>'XHTML'
,'|^Panasonic-VS3|'=>'XHTML'
,'|^Panasonic-VS7|'=>'XHTML'
,'|^Panasonic-X60|'=>'XHTML'
,'|^Panasonic-X400|'=>'XHTML'
,'|^Panasonic-X700|'=>'XHTML'
,'|^Panasonic|'=>'WML'
,'|^LGE-CU8280|'=>'WML'
,'|^LG-G650|'=>'WML'
,'|^LG-G1600|'=>'WML'
,'|^LG-G4010|'=>'WML'
,'|^LG-G5220|'=>'WML'
,'|^LG-G5400|'=>'WML'
,'|^LG5400|'=>'WML'
,'|^LG-G7100|'=>'WML'
,'|^LG-G7110|'=>'WML'
,'|^LG G8000|'=>'WML'
,'|^LG|'=>'XHTML'
,'|^Grundig GR|'=>'HTML'
,'|^GRUNDIG|'=>'WML'
,'|^Device/TaiChiPlus|'=>'WML'
,'|^R380|'=>'WML'
,'|^BenQ P30|'=>'WML'
,'|^ASUS-J100|'=>'XHTML'
,'|^ASUS-J1|'=>'WML'
,'|^TIM|'=>'WML'
,'|^MEDION|'=>'WML'
,'|^Maxon|'=>'WML'
,'|^GF-500|'=>'WML'
,'|^LYNX/R01|'=>'WML'
,'|^Toplux|'=>'WML'
,'|^VK520|'=>'WML'
,'|^WAPPER|'=>'WML'
,'|Profile/MIDP|i'=>'XHTML'
,'|MMP/1\.0|'=>'XHTML'
,'|iPod|'=>'XHTML'
,'|^G83|'=>'XHTML'
,'|^MTV|'=>'XHTML'
,'|^Hutc3G/1|'=>'XHTML'
,'|TS705|'=>'XHTML'
,'|^Amoi|'=>'XHTML'
,'|^Haier-M1000|'=>'XHTML'
,'|^FLY-SL600|'=>'XHTML'
,'|^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows 95\) 16;160x160|'=>'XHTML');
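// The first pattern that matches decides the result.
foreach($m as $r=>$ml) if(preg_match($r,$ua)) return $ml;
// No pattern matched: fall back to ordinary HTML.
return 'HTML';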
}
?>

For experts: the script correctly handles all browsers located below “actual_device_root”, as well as all browsers from web_browsers_patch.xml.
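The order of the patterns matters, because the script returns on the first match. The sketch below illustrates this with three rules copied from the script. The helper firstMatch() and the sample USER_AGENT strings are made up for the illustration and are not taken from WURFL.

```php
<?php
// Same first-match-wins loop as in CheckMobile(), but with the
// pattern table passed in so that a small subset can be tried out.
function firstMatch(array $patterns, $ua)
{
    foreach ($patterns as $regex => $markup) {
        if (preg_match($regex, $ua)) {
            return $markup;
        }
    }
    return 'HTML';
}

// Three rules from the script, in their original relative order.
$rules = array(
    '|^nokia6610i|i' => 'XHTML',
    '|^Nokia6610|'   => 'WML',
    '|nokia|i'       => 'XHTML',
);

// Illustrative USER_AGENT strings (assumed, not real device data).
echo firstMatch($rules, 'Nokia6610I/1.0'), "\n"; // XHTML: the most specific rule matches
echo firstMatch($rules, 'Nokia6610/1.0'), "\n";  // WML: the second rule matches
echo firstMatch($rules, 'Nokia3650/1.0'), "\n";  // XHTML: only the general rule matches

// With the order reversed, the general rule shadows the specific ones.
$reversed = array_reverse($rules, true);
echo firstMatch($reversed, 'Nokia6610/1.0'), "\n"; // XHTML instead of WML
```

This is why a new pattern cannot simply be appended to the end of the table: it has to go above every more general pattern that would otherwise catch the same device.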

By the way, this script is 1000 times smaller than the wurfl.xml file!

However, I want to note some disadvantages of my approach:

  1. Creating such a script is manual work, so it will be hard to update if substantial additions are made to the WURFL library.
  2. A sequence of 258 regular expressions is used to determine the markup language. It may not run as fast as one would like. So the question remains open: which is faster, this script or WURFL with multicache?
  3. Finally, the WURFL library provides much more information than my script (for example, the screen resolution of the mobile device).
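The speed question in point 2 can only be settled by measuring. The following is a rough sketch of such a measurement, not a benchmark result: averageSeconds() and sampleDetector() are hypothetical names, and sampleDetector() is a one-pattern stand-in so that the example runs on its own. To time the real script, pass 'CheckMobile' instead, together with USER_AGENT strings taken from your own server log. The WURFL side is not shown because its API is not described in this post.

```php
<?php
// Hypothetical helper: average time in seconds of one call to $fn,
// measured over $rounds passes through the given USER_AGENT strings.
function averageSeconds($fn, array $userAgents, $rounds = 100)
{
    $calls = $rounds * count($userAgents);
    if ($calls === 0) {
        return 0.0; // nothing to measure
    }
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        foreach ($userAgents as $ua) {
            call_user_func($fn, $ua);
        }
    }
    return (microtime(true) - $start) / $calls;
}

// Stand-in for CheckMobile() so that the sketch is self-contained.
function sampleDetector($ua)
{
    return preg_match('|nokia|i', $ua) ? 'XHTML' : 'HTML';
}

// Illustrative input; real measurements need real USER_AGENT strings.
$userAgents = array('Nokia3650/1.0', 'Mozilla/5.0 (X11; Linux)');
printf("%.3f microseconds per call\n", averageSeconds('sampleDetector', $userAgents) * 1e6);
```

Devices whose pattern sits near the end of the table are the slowest case for the script, so the sample should include them and not only the common devices.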

