Paul Meagher shows how to use a database query to calculate conditional probability.

Conditional Probability and SQL


The conditional probability P(A | B) can be mapped onto database-query operations. For example, the probability of cancer given a positive test result, P(+cancer | +test), can be obtained by issuing the following SQL query and then tallying the result set:

SELECT cancer_status FROM Data WHERE test_status='+test'
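
The tallying step can be made concrete with a minimal sketch in Python using the standard sqlite3 module. The six patient rows and the '-cancer' and '-test' labels are made-up assumptions for illustration; only the table name, column names, and the query itself come from the text.

```python
import sqlite3
from collections import Counter

# Hypothetical training data: (cancer_status, test_status) per patient.
rows = [
    ("+cancer", "+test"),
    ("+cancer", "+test"),
    ("-cancer", "+test"),
    ("-cancer", "-test"),
    ("+cancer", "-test"),
    ("-cancer", "-test"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (cancer_status TEXT, test_status TEXT)")
conn.executemany("INSERT INTO Data VALUES (?, ?)", rows)

# The query from the text: keep only rows where the conditioning event holds.
result = conn.execute(
    "SELECT cancer_status FROM Data WHERE test_status='+test'"
).fetchall()

# Tally the outcomes in the result set, then divide by its size.
tally = Counter(status for (status,) in result)
p_cancer_given_test = tally["+cancer"] / len(result)
print(tally, p_cancer_given_test)
```

With this made-up data, three patients tested positive and two of them have cancer, so the estimate of P(+cancer | +test) is 2/3.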

If I gather information about how several Boolean-valued tests co-vary with a Boolean-valued diagnosis (such as cancer or not cancer), then I can issue slightly more complex queries to study how diagnostically useful other factors are in determining whether a patient has cancer, such as the following:

SELECT cancer_status FROM Data WHERE genetic_status='+' AND age_status='+' AND biopsy_status='+'
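
As a sketch of how such multi-factor queries might be generalized, the helper below builds the WHERE clause from a dictionary of conditions and lets the database do the tally with AVG. The function name conditional_probability, the sample rows, and the '-' and '-cancer' labels are assumptions for illustration and are not part of the original article.

```python
import sqlite3

def conditional_probability(conn, target_column, target_value, conditions):
    """Estimate P(target_column = target_value | all conditions) from Data.

    Column names are interpolated into the SQL, so they must come from
    trusted code; values are bound as parameters. Returns None when no
    row satisfies the conditions.
    """
    where = " AND ".join(f"{column} = ?" for column in conditions) or "1 = 1"
    sql = (
        f"SELECT AVG(CASE WHEN {target_column} = ? THEN 1.0 ELSE 0.0 END) "
        f"FROM Data WHERE {where}"
    )
    params = [target_value, *conditions.values()]
    return conn.execute(sql, params).fetchone()[0]

# Hypothetical rows: (cancer_status, genetic_status, age_status, biopsy_status).
rows = [
    ("+cancer", "+", "+", "+"),
    ("+cancer", "+", "+", "+"),
    ("+cancer", "+", "+", "+"),
    ("-cancer", "+", "+", "+"),
    ("-cancer", "+", "-", "+"),
    ("-cancer", "-", "+", "-"),
    ("+cancer", "-", "-", "+"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Data (cancer_status TEXT, genetic_status TEXT, "
    "age_status TEXT, biopsy_status TEXT)"
)
conn.executemany("INSERT INTO Data VALUES (?, ?, ?, ?)", rows)

p_all_three = conditional_probability(
    conn, "cancer_status", "+cancer",
    {"genetic_status": "+", "age_status": "+", "biopsy_status": "+"},
)
p_biopsy_only = conditional_probability(
    conn, "cancer_status", "+cancer", {"biopsy_status": "+"}
)
print(p_all_three, p_biopsy_only)
```

Comparing the two estimates shows how adding or dropping a factor changes the conditional probability, which is one way to judge how diagnostically useful each factor is.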

In the case of detecting e-mail spam, I might be interested in computing P(+spam | title_word='viagra' AND title_word='free'), which could be viewed as a directive to issue the following SQL query:

SELECT spam_status FROM Emails WHERE email_title LIKE '%viagra%' AND email_title LIKE '%free%'

The % wildcards make LIKE match each word anywhere in the title; without them, LIKE compares against the whole title and both conditions could never hold at once. After enumerating the number of e-mails that are spam and have "viagra" and "free" in the title (like so):

SELECT COUNT(*) FROM Emails WHERE spam_status='+spam' AND email_title LIKE '%viagra%' AND email_title LIKE '%free%'

and dividing by the overall number of e-mails with the words "viagra" and "free" in the title:

SELECT COUNT(*) FROM Emails WHERE email_title LIKE '%viagra%' AND email_title LIKE '%free%'

I might arrive at the conclusion that the appearance of these words in the title strongly and specifically co-varies with the message being spam (after all, 18/18 = 100 percent), and this rule might be used to automatically filter such messages.
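
The count-and-divide procedure described above can be sketched as follows. The six e-mails are invented for illustration, so the ratio here is 3/3 rather than the 18/18 quoted in the text. The sketch also assumes % wildcards around each word so that LIKE matches the word anywhere in the title, and it relies on SQLite's LIKE being case-insensitive for ASCII letters by default.

```python
import sqlite3

# Hypothetical e-mails: (spam_status, email_title).
emails = [
    ("+spam", "free viagra today"),
    ("+spam", "viagra for free"),
    ("+spam", "FREE Viagra, no prescription"),
    ("+spam", "free vacation offer"),
    ("-spam", "free lunch on friday"),
    ("-spam", "meeting notes"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emails (spam_status TEXT, email_title TEXT)")
conn.executemany("INSERT INTO Emails VALUES (?, ?)", emails)

# Conditioning event: both words occur somewhere in the title.
condition = "email_title LIKE '%viagra%' AND email_title LIKE '%free%'"

# Numerator: e-mails that are spam and satisfy the condition.
spam_and_words = conn.execute(
    f"SELECT COUNT(*) FROM Emails WHERE spam_status='+spam' AND {condition}"
).fetchone()[0]

# Denominator: all e-mails that satisfy the condition.
words = conn.execute(
    f"SELECT COUNT(*) FROM Emails WHERE {condition}"
).fetchone()[0]

# Guard against an empty conditioning set before dividing.
p_spam_given_words = spam_and_words / words if words else None
print(spam_and_words, words, p_spam_given_words)
```

Note that a substring pattern such as '%free%' also matches titles containing "freedom"; a real filter would probably tokenize titles into words first.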

In Bayesian spam filtering, you first need to train the software on which e-mails are spam and which are not. One can imagine storing spam_status information with each e-mail record (for example, email_id, spam_status, email_title, and email_message) and running the previous queries and counts on this data to decide whether to forward a new e-mail to your inbox.
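
One way this storage-and-decision idea might look is sketched below. The four columns come from the text; the training rows, the function names spam_probability and should_forward, the 0.9 threshold, and the choice to forward when there is no evidence are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emails ("
    "email_id INTEGER PRIMARY KEY, spam_status TEXT, "
    "email_title TEXT, email_message TEXT)"
)

# Hypothetical labelled training e-mails.
training = [
    ("+spam", "free viagra today", "buy now"),
    ("+spam", "viagra for free", "limited offer"),
    ("+spam", "free offer", "click here"),
    ("-spam", "free lunch on friday", "see you in the kitchen"),
    ("-spam", "meeting notes", "minutes attached"),
]
conn.executemany(
    "INSERT INTO Emails (spam_status, email_title, email_message) "
    "VALUES (?, ?, ?)",
    training,
)

def spam_probability(conn, title_words):
    """Estimate P(+spam | every word appears in the title), or None."""
    if not title_words:
        return None
    where = " AND ".join("email_title LIKE ?" for _ in title_words)
    # Words containing % or _ would act as LIKE wildcards in this sketch.
    patterns = [f"%{word}%" for word in title_words]
    matching, spam = conn.execute(
        "SELECT COUNT(*), SUM(spam_status = '+spam') FROM Emails WHERE " + where,
        patterns,
    ).fetchone()
    return spam / matching if matching else None

def should_forward(conn, title_words, threshold=0.9):
    """Forward unless the estimated spam probability reaches the threshold."""
    p = spam_probability(conn, title_words)
    return p is None or p < threshold

print(should_forward(conn, ["viagra", "free"]))  # both words seen only in spam
print(should_forward(conn, ["free"]))            # "free" also occurs in non-spam
```

Labelling each new e-mail and inserting it into the same table would let the counts, and therefore the decisions, improve as more training data accumulates.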




The content shown in this page was first published by IBM developerWorks and is reprinted with permission from Paul Meagher (www.datavore.com)

