A computer is an extremely useful electronic device that quickly processes the data it is given and presents it in whatever form is required. Because it can do calculations large and small in an instant, store huge volumes of records, and let them be viewed or retrieved whenever needed, its popularity is growing day by day. Whether for typing, exchanging information with the whole world from home, watching films or playing games, the computer is becoming an essential need of every household in the modern age. A general knowledge of it is therefore necessary for everyone.
Reasons for the computer's popularity:
Versatile (usable in every field) - A computer can be set up to do the work of many different fields as needed and can be put to work in them. It can be run on its own, and it is also used to operate and control various other devices.
Efficient - It can keep doing tasks, from the smallest to large, complex and tedious ones, continuously and without tiring.
Reliability - It carries out difficult and complex tasks correctly. Work done by people is very prone to unintentional mistakes, but when a computer is given correct data and correct instructions, the work it does contains no errors.
Storage & Quick Processing - It can store large amounts of data and, when needed, select and deliver what is required almost instantly.
In the modern era, information technology has developed to the point where it has not only captured markets worldwide but also won wide popularity. The invention of the Internet made e-commerce (electronic commerce) possible. People can now look at, bargain for and buy goods from anywhere in the world while sitting at home. Through the Internet and email, letters and news can be exchanged with every corner of the world quickly and cheaply; moreover, Internet telephony now makes it possible to call faraway countries such as America and Australia as if placing a local call. As a result, people of the modern age began to be drawn to information technology, and traders in the world market began investing heavily in this field. Behind all of this was one thing: the computer. Because the computers inside satellites and in the control rooms of satellite ground stations exchange information in hundredths of a second, it became possible to develop the most advanced information technologies.
What a computer consists of -
A computer has three main parts.
Input - the means of entering data.
Processor - the processing unit.
Output - the means of getting results out.
Input devices include the keyboard, mouse, scanner, microphone and joystick. These devices are used to give the computer commands and to enter data.
Processing takes place in the CPU box. The processor processes the data as instructed.
Output devices are the means by which data processed by the computer can be viewed or retrieved. Monitors, printers, speakers and the like are output devices.
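The three parts described above can be mirrored in a few lines of code. This is only a minimal sketch: the function names `read_input`, `process` and `write_output` are made up for illustration, and a real computer's input and output devices are of course hardware, not functions.

```python
def read_input(raw_text):
    """Input stage: turn typed text into a list of numbers."""
    return [float(token) for token in raw_text.split()]


def process(numbers):
    """Processing stage: compute a total and an average from the data."""
    total = sum(numbers)
    average = total / len(numbers) if numbers else 0.0
    return {"total": total, "average": average}


def write_output(result):
    """Output stage: format the processed result for display."""
    return f"total={result['total']:.2f} average={result['average']:.2f}"


# Data flows input -> processor -> output, as in the description above.
print(write_output(process(read_input("10 20 30"))))
```

Each stage only knows about the data handed to it, which is roughly why the same processor can serve very different input and output devices.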
Monitor
This is the computer's screen. The monitor is used to see what the computer is doing and what instruction (command) needs to be given for a task. The monitor is connected by a data cable (VGA cable) to the VGA (display card) in the CPU box. Because the monitor needs its own power, a power cable is also attached to it. Some monitors can take their power from the power supply unit of the CPU box, while others plug straight into a wall socket. An ordinary monitor, like a television set, is built around a cathode ray tube. Flat monitors with LCD displays, and plasma monitors, are also increasingly used nowadays. Some monitors can themselves be used to give commands to the computer; these are called touchscreens. A monitor is also called a VDU (Visual Display Unit).
CPU Box or Cabinet
The cabinet, or casing, is a box enclosed on all sides. In everyday speech the computer's main box is called the CPU box. All of the computer's input and output devices are connected to it, which is why it is also the most important part of the computer. The brain of the computer is the CPU, the central processing unit. For short it is simply called the processor. It is mounted on the motherboard inside the CPU box. The computer's capability depends on its speed (CPU speed). The processor processes data as we instruct it.
A Brief History of the Development of the Computer
The abacus, invented around 3000 BC, is regarded as the earliest forerunner of the computer. Arithmetic is considered to have developed greatly once numerals came into use. In 1617 John Napier of Scotland devised Napier's bones; around 1620 William Oughtred devised the slide rule; in 1642 Blaise Pascal of France built the Pascaline, an adding and subtracting machine; in 1671 Leibniz designed the Leibniz calculator; and in 1822 Charles Babbage of Britain began work on the Difference Engine, later followed by his design for the Analytical Engine. None of these, however, amounted to a complete computer. Processing the records of the 1880 United States census proved very troublesome, so around 1887 Dr. Herman Hollerith, an American of German descent, built a punched-card tabulating machine that could read and sort 200 cards per minute, for which he was also honored. In 1941 Konrad Zuse built the Z-3, in 1942 John V. Atanasoff built the ABC (Atanasoff-Berry Computer), and in 1944 Howard Aiken completed the MARK-I, an automatic calculating machine, after which development took a new turn; the MARK-II followed in 1947. These machines were enormous, and most of them were electromechanical computers.
Finally, in 1944, the Hungarian-born mathematician John von Neumann conceived of and worked on a computer called EDVAC, though it was not completely successful. In 1951 J. Presper Eckert and John Mauchly jointly produced the UNIVAC computer. These were all first-generation computers, and this is regarded as the beginning of the modern computer. After that, second-, third- and fourth-generation computers were invented. Computers under names such as IBM PC, Apple Macintosh, Acer, Siemens, Compaq, Dell and Gateway are now on the market. In Nepal too, Mercantile Office Systems has begun producing the Mercantile PC and Beltronix Traders the Beltronix PC.
Among the computers available today, the Pentium IV microcomputer can be regarded as the most advanced. These computers use microprocessors capable of working at very high speed, and they are built so that they can work with whatever software is installed. With the help of computer networks and the Internet, banks and financial institutions have developed the technology to pay out in Nepal, within a few minutes, money held abroad, and banks have therefore begun using computers heavily. In addition, air ticketing offices and travel agencies are benefiting greatly from using computers in their businesses. In fact, computers are widely used in every field, including information, communication, transport, education, health and defense.
History of the Computer in Nepal
The use of computing in Nepal began when the Central Bureau of Statistics used a calculating machine called FACIT during the census of B.S. 2018. Various errors appeared in its results, causing a great loss to the nation. For the 2028 census the government of the day therefore used a second-generation computer, the IBM 1401, rented for Rs. 125,000 per month. Processing the data of 11.3 million people with it took 1 year, 7 months and 15 days. To handle all computer-related work, the Yantrik Sarinikaran Kendra (Electronic Data Processing Centre) was opened in Poush 2031, and around 2037 it was renamed the National Computer Centre. In 2038 the British ICL 2950/10 computer was brought in for the census; it was obtained with UNDP/UNFPA support at a cost of US$2 million, and with it the census work was completed in 1 year and 3 months. Around 2039 various private computer firms opened in Kathmandu. From the 2040s onward the use of computers had spread widely across the country. From that time the National Computer Centre also began running training programs to produce skilled computer professionals.
Because a large amount of data can be kept in a small space and found in a moment when needed, institutions that must keep records, such as the School Leaving Certificate examination board and the Employees Provident Fund, began using computers extensively, and professional use of computers began. Banks and financial institutions have likewise been using computers extensively. Nowadays, whatever kind of office or business is opened, a computer is felt to be needed right after the furniture. Computers are used not only for data processing but also for documentation, newspaper and magazine layout design and similar work. With computers, highly technical fields such as telecommunications and air transport have moved from complexity toward simplicity. Because switching, transmission, inquiry and billing in Nepal's telecommunications sector are fully computerized, all of the country's telephone systems are digital. Computers have made it very easy to install new systems or to repair old ones when trouble arises. The invention of the computer has helped people in every respect.
Source: Wikipedia
Wednesday, July 25, 2007
Friday, July 20, 2007
How 3-D PC Glasses Work
Only a few years ago, seeing in 3-D meant peering through a pair of red-and-blue glasses, or trying not to go cross-eyed in front of a page of fuzzy dots. It was great at the time, but 3-D technology has moved on. Scientists know more about how our vision works than ever before, and our computers are more powerful than ever before -- most of us have sophisticated components in our computer that are dedicated to producing realistic graphics. Put those two things together, and you'll see how 3-D graphics have really begun to take off.
Most computer users are familiar with 3-D games. Back in the '90s, computer enthusiasts were stunned by the game Wolfenstein 3D, which took place in a maze-like castle. It may have been constructed from blocky tiles, but the castle existed in three dimensions -- you could move forward and backward, or hold down the appropriate key and see your viewpoint spin through 360 degrees. Back then, it was revolutionary and quite amazing. Nowadays, gamers enjoy ever more complicated graphics -- smooth, three-dimensional environments complete with realistic lighting and complex simulations of real-life physics grace our screens. But that's the problem -- the screen. The game itself may be in three dimensions, and the player may be able to look wherever he wants with complete freedom, but at the end of the day the picture is displayed on a computer monitor...and that's a flat surface.
That's where PC 3-D glasses come in. They're designed to convince your brain that your monitor is showing a real, three-dimensional object. In order to understand quite how this works, we need to know what sort of work our brain does with the information our eyes give it. Once we know about that, we'll be able to understand just how 3-D glasses do their job.
Tuesday, July 10, 2007
How Spyware Works
Spyware is a category of computer programs that attach themselves to your operating system in nefarious ways. They can suck the life out of your computer's processing power. They are designed to track your Internet habits, nag you with unwanted sales offers or generate traffic for their host Web site. According to recent estimates, more than two-thirds of all personal computers are infected with some kind of spyware. But before you chuck your computer out the window and move to a desert island, you might want to read on. In this article we'll explain how spyware gets on your computer, what it does there and how to get rid of it.
Some people mistake spyware for a computer virus. A computer virus is a piece of code designed to replicate itself as many times as possible, spreading from one host computer to any other computers connected to it. It usually has a payload that may damage your personal files or even your operating system.
Spyware, on the other hand, is generally not designed to damage your computer. Spyware is broadly defined as any program that gets into your computer without permission and hides in the background while it makes unwanted changes to your user experience. The damage it does is more a by-product of its main mission, which is to serve you targeted advertisements or make your browser display certain sites or search results.
At present, most spyware targets only the Windows operating system. Some of the more notorious spyware companies include Gator, Bonzi Buddy, 180 Solutions, DirectRevenue, Cydoor, CoolWebSearch, Xupiter, XXXDial and Euniverse.
Next, we'll look at the different ways that spyware can get onto your computer.
How Facial Recognition Systems Work
Anyone who has seen the TV show "Las Vegas" has seen facial recognition software in action. In any given episode, the security department at the fictional Montecito Hotel and Casino uses its video surveillance system to pull an image of a card counter, thief or blacklisted individual. It then runs that image through the database to find a match and identify the person. By the end of the hour, all bad guys are escorted from the casino or thrown in jail.
But what looks so easy on TV doesn't always translate as well in the real world. In 2001, the Tampa Police Department installed cameras equipped with facial recognition technology in their Ybor City nightlife district in an attempt to cut down on crime in the area. The system failed to do the job, and it was scrapped in 2003 due to ineffectiveness. People in the area were seen wearing masks and making obscene gestures, preventing the cameras from getting a clear enough shot to identify anyone. Boston's Logan Airport also ran two separate tests of facial recognition systems at its security checkpoints using volunteers. Over a three-month period, the results were disappointing. According to the Electronic Privacy Information Center, the system only had a 61.4 percent accuracy rate, leading airport officials to pursue other security options.
In this article, we will look at the history of facial recognition systems, the changes that are being made to enhance their capabilities and how governments and private companies use (or plan to use) them. Humans have always had the innate ability to recognize and distinguish between faces, yet computers only recently have shown the same ability. In the mid 1960s, scientists began work on using the computer to recognize human faces. Since then, facial recognition software has come a long way.
Identix®, a company based in Minnesota, is one of many developers of facial recognition technology. Its software, FaceIt®, can pick someone's face out of a crowd, extract the face from the rest of the scene and compare it to a database of stored images. In order for this software to work, it has to know how to differentiate between a basic face and the rest of the background. Facial recognition software is based on the ability to recognize a face and then measure the various features of the face.
Every face has numerous, distinguishable landmarks, the different peaks and valleys that make up facial features. FaceIt defines these landmarks as nodal points. Each human face has approximately 80 nodal points. Some of those measured by the software are:
Distance between the eyes
Width of the nose
Depth of the eye sockets
The shape of the cheekbones
The length of the jaw line
These nodal points are measured to create a numerical code, called a faceprint, that represents the face in the database. FaceIt software compares the faceprint with other images in the database.
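To make the idea of a faceprint concrete, here is a rough sketch that treats it as a short list of numbers and finds the closest stored face by straight-line (Euclidean) distance. This is not how FaceIt actually works, which the article does not describe; the feature names, the measurements and the helper functions `faceprint`, `distance` and `best_match` are all invented for illustration.

```python
import math

# Hypothetical nodal-point measurements (arbitrary units), in a fixed order.
FEATURES = ("eye_distance", "nose_width", "eye_socket_depth", "jaw_length")


def faceprint(measurements):
    """Turn a dict of measurements into a numeric code (a tuple of floats)."""
    return tuple(float(measurements[name]) for name in FEATURES)


def distance(a, b):
    """Euclidean distance between two faceprints; smaller means more alike."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(probe, database):
    """Return (name, distance) for the stored faceprint closest to the probe."""
    return min(
        ((name, distance(probe, stored)) for name, stored in database.items()),
        key=lambda pair: pair[1],
    )


database = {
    "alice": faceprint({"eye_distance": 62, "nose_width": 34,
                        "eye_socket_depth": 21, "jaw_length": 118}),
    "bob": faceprint({"eye_distance": 58, "nose_width": 38,
                      "eye_socket_depth": 25, "jaw_length": 125}),
}
probe = faceprint({"eye_distance": 61, "nose_width": 35,
                   "eye_socket_depth": 21, "jaw_length": 119})

name, score = best_match(probe, database)
print(name, round(score, 3))
```

A real system would also need a threshold on the distance, so that a face not in the database is reported as unknown rather than matched to its nearest neighbor.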
In the past, facial recognition software has relied on a 2D image to compare or identify another 2D image from the database. To be effective and accurate, the image captured needed to be of a face that was looking almost directly at the camera, with little variance of light or facial expression from the image in the database. This created quite a problem.
In most instances the images were not taken in a controlled environment. Even the smallest changes in light or orientation could reduce the effectiveness of the system, so the captured images couldn't be matched to any face in the database, leading to a high rate of failure. In the next section, we will look at ways to correct the problem.
Monday, July 9, 2007
Introduction to How Internet Search Engines Work #2
When most people talk about Internet search engines, they really mean World Wide Web search engines. Before the Web became the most visible part of the Internet, there were already search engines in place to help people find information on the Net. Programs with names like "gopher" and "Archie" kept indexes of files stored on servers connected to the Internet, and dramatically reduced the amount of time required to find programs and documents. In the early 1990s, getting serious value from the Internet meant knowing how to use gopher, Archie, Veronica and the rest.
Today, most Internet users limit their searches to the Web, so we'll limit this article to search engines that focus on the contents of Web pages.
An Itsy-Bitsy Beginning
Before a search engine can tell you where a file or document is, it must be found. To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling. (There are some disadvantages to calling part of the Internet the World Wide Web -- a large set of arachnid-centric names for tools is one of them.) In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.
How does any spider start its travels over the Web? The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.
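The spreading-out behavior described above is essentially a breadth-first walk over links. Here is a minimal sketch of that idea; it runs against a tiny made-up link table (`LINKS`, with invented example URLs) instead of fetching real pages, and the function name `crawl` is only illustrative.

```python
from collections import deque

# Hypothetical in-memory "Web": each URL maps to the links found on that page.
LINKS = {
    "popular.example/": ["popular.example/news", "popular.example/about"],
    "popular.example/news": ["other.example/", "popular.example/"],
    "popular.example/about": [],
    "other.example/": ["popular.example/news", "other.example/deep"],
    "other.example/deep": [],
}


def crawl(seeds, links, max_pages=100):
    """Visit pages breadth-first from the seed list, following every new link."""
    visited = []            # pages in the order the spider reached them
    seen = set(seeds)       # URLs already queued, so no page is visited twice
    queue = deque(seeds)
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


print(crawl(["popular.example/"], LINKS))
```

Starting from one popular page, the spider reaches every page in the table, including the one on a different site, without visiting any page twice.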
"Spiders" take a Web page's content and create key search words that enable online users to find pages they're looking for.
Google.com began as an academic search engine. In the paper that describes how the system was built, Sergey Brin and Lawrence Page give an example of how quickly their spiders can work. They built their initial system to use multiple spiders, usually three at one time. Each spider could keep about 300 connections to Web pages open at a time. At its peak performance, using four spiders, their system could crawl over 100 pages per second, generating around 600 kilobytes of data each second.
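A quick back-of-the-envelope calculation shows what those peak figures imply. The two rates come from the paragraph above; assuming the peak rate could be held for a whole day is my own simplification, so treat the daily number as an upper bound.

```python
PAGES_PER_SECOND = 100        # peak crawl rate quoted above
KILOBYTES_PER_SECOND = 600    # data generated at that rate

# Average amount of data per crawled page.
kb_per_page = KILOBYTES_PER_SECOND / PAGES_PER_SECOND

# Pages per day IF the peak rate were sustained around the clock (assumption).
pages_per_day = PAGES_PER_SECOND * 60 * 60 * 24

print(kb_per_page)     # 6.0
print(pages_per_day)   # 8640000
```

So each page contributed about 6 kilobytes on average, and even at peak speed a Web of hundreds of millions of pages would take weeks to cover.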
Keeping everything running quickly meant building a system to feed necessary information to the spiders. The early Google system had a server dedicated to providing URLs to the spiders. Rather than depending on an Internet service provider for the domain name server (DNS) that translates a server's name into an address, Google had its own DNS, in order to keep delays to a minimum.
When the Google spider looked at an HTML page, it took note of two things:
The words within the page
Where the words were found
Words occurring in the title, subtitles, meta tags and other positions of relative importance were noted for special consideration during a subsequent user search. The Google spider was built to index every significant word on a page, leaving out the articles "a," "an" and "the." Other spiders take different approaches.
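A small sketch of that bookkeeping: for every word, record which page it came from, which part of the page, and its position there, while skipping the three articles. The data layout and the names `tokenize` and `index_page` are assumptions made for illustration, not Google's actual format.

```python
import re
from collections import defaultdict

ARTICLES = {"a", "an", "the"}   # the words the text says were left out


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_page(url, fields, index):
    """Record each word with the page, the part of the page, and its position.

    `fields` maps a location name (such as 'title' or 'body') to its text.
    """
    for location, text in fields.items():
        for position, word in enumerate(tokenize(text)):
            if word in ARTICLES:
                continue
            index[word].append((url, location, position))


index = defaultdict(list)
index_page(
    "example.com/spiders",
    {"title": "The Spider Guide", "body": "A spider crawls the Web"},
    index,
)
print(index["spider"])
```

Because the location is stored with each entry, a later search can give extra weight to a word found in the title over the same word found in the body.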
These different approaches usually attempt to make the spider operate faster, allow users to search more efficiently, or both. For example, some spiders will keep track of the words in the title, sub-headings and links, along with the 100 most frequently used words on the page and each word in the first 20 lines of text. Lycos is said to use this approach to spidering the Web.
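The selective approach just described could look something like the sketch below. It only follows the rule as stated in the text (title, sub-headings, links, the most frequent words, and the words in the first lines); it is not Lycos code, and the function name `select_words` and its parameters are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_words(title, headings, link_texts, body, top_n=100, first_lines=20):
    """Pick the subset of a page's words that this style of spider would keep."""
    selected = set(tokenize(title))
    for text in list(headings) + list(link_texts):
        selected.update(tokenize(text))
    # The top_n most frequently used words in the body text.
    counts = Counter(tokenize(body))
    selected.update(word for word, _ in counts.most_common(top_n))
    # Every word in the first few lines of the body text.
    for line in body.splitlines()[:first_lines]:
        selected.update(tokenize(line))
    return selected


body = "alpha alpha beta\ngamma gamma gamma delta\nepsilon"
words = select_words("Zeta", ["Eta"], ["theta"], body, top_n=1, first_lines=1)
print(sorted(words))
```

With the limits turned down to one frequent word and one line, the rarely used words deep in the page ("delta" and "epsilon") are never indexed, which is the trade-off this approach makes for speed.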
Other systems, such as AltaVista, go in the other direction, indexing every single word on a page, including "a," "an," "the" and other "insignificant" words. The push to completeness in this approach is matched by other systems in the attention given to the unseen portion of the Web page, the meta tags.
Meta Tags
Meta tags allow the owner of a page to specify key words and concepts under which the page will be indexed. This can be helpful, especially in cases in which the words on the page might have double or triple meanings -- the meta tags can guide the search engine in choosing which of the several possible meanings for these words is correct. There is, however, a danger in over-reliance on meta tags, because a careless or unscrupulous page owner might add meta tags that fit very popular topics but have nothing to do with the actual contents of the page. To protect against this, spiders will correlate meta tags with page content, rejecting the meta tags that don't match the words on the page.
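One simple way a spider might do that correlation is to keep a meta keyword only if every word in it also shows up in the page text. Real search engines are surely more subtle; this is a minimal sketch, and the function name `trusted_meta_keywords` is made up.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def trusted_meta_keywords(meta_keywords, page_text):
    """Keep only the meta keywords whose words all appear in the page text."""
    page_words = set(tokenize(page_text))
    kept = []
    for keyword in meta_keywords:
        words = tokenize(keyword)
        if words and all(word in page_words for word in words):
            kept.append(keyword)
    return kept


print(trusted_meta_keywords(
    ["spider", "web crawling", "lottery"],
    "How a spider goes crawling across the Web",
))
```

The keyword "lottery" is rejected because the page never mentions it, while "spider" and "web crawling" survive.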
All of this assumes that the owner of a page actually wants it to be included in the results of a search engine's activities. Many times, the page's owner doesn't want it showing up on a major search engine, or doesn't want the activity of a spider accessing the page. Consider, for example, a game that builds new, active pages each time sections of the page are displayed or new links are followed. If a Web spider accesses one of these pages, and begins following all of the links for new pages, the game could mistake the activity for a high-speed human player and spin out of control. To avoid situations like this, the robot exclusion protocol was developed. This protocol, implemented in the meta-tag section at the beginning of a Web page, tells a spider to leave the page alone -- to neither index the words on the page nor try to follow its links.
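In practice the meta-tag form of this is a tag such as `<meta name="robots" content="noindex, nofollow">` in the page's head. The sketch below shows how a polite spider might check for it using Python's standard `html.parser` module; the class name `RobotsMetaParser` and the helper `robots_permissions` are my own, and only the common `noindex`, `nofollow` and `none` directives are handled.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Look for <meta name="robots" ...> and record what the page allows."""

    def __init__(self):
        super().__init__()
        self.may_index = True    # default: indexing is allowed
        self.may_follow = True   # default: following links is allowed

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        directives = {part.strip().lower() for part in content.split(",")}
        if "noindex" in directives or "none" in directives:
            self.may_index = False
        if "nofollow" in directives or "none" in directives:
            self.may_follow = False


def robots_permissions(html):
    """Return (may_index, may_follow) for a page's HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.may_index, parser.may_follow


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_permissions(page))
```

A page with no robots meta tag leaves both permissions at their defaults, so the spider is free to index it and follow its links.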
Introduction to How Internet Search Engines Work #1
The good news about the Internet and its most visible component, the World Wide Web, is that there are hundreds of millions of pages available, waiting to present information on an amazing variety of topics. The bad news about the Internet is that there are hundreds of millions of pages available, most of them titled according to the whim of their author, almost all of them sitting on servers with cryptic names. When you need to know about a particular subject, how do you know which pages to read? If you're like most people, you visit an Internet search engine.
Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks:
They search the Internet -- or select pieces of the Internet -- based on important words.
They keep an index of the words they find, and where they find them.
They allow users to look for words or combinations of words found in that index.
Early search engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand inquiries each day. Today, a top search engine will index hundreds of millions of pages, and respond to tens of millions of queries per day. In this article, we'll tell you how these major tasks are performed, and how Internet search engines put the pieces together in order to let you find the information you need on the Web.
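The second and third tasks, keeping an index and letting users look up combinations of words, can be sketched in a few lines. This toy version works on three made-up pages (`PAGES`) and treats a query as "all of these words"; the names `build_index` and `search` are illustrative, and real engines add ranking on top.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for the results of a crawl.
PAGES = {
    "a.example": "How spiders crawl the Web",
    "b.example": "Spiders and other arachnids",
    "c.example": "How to index the Web",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of pages on which it was found."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by URL."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


index = build_index(PAGES)
print(search(index, "spiders web"))
```

Looking words up in a prebuilt index like this, instead of rereading every page for every query, is what lets a search engine answer millions of queries a day.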
Saturday, July 7, 2007
How to Become a Hacker
This article is based on an essay I wrote in December of 1998.
Looking for advice on learning to crack passwords, sabotage systems, mangle websites, write viruses, and plant Trojan horses? You came to the wrong place. I'm not that kind of hacker.
Looking for advice on how to learn the guts and bowels of a system or network, get inside it, and become a real expert? Maybe I can help there. How you use this knowledge is up to you. I hope you'll use it to contribute to computer science and hacking (in its good sense), not to become a cracker or vandal.
This little essay is basically the answers to all the emails I get asking how to become a hacker. It's not a tutorial in and of itself. It's certainly not a guaranteed success. Just give it a try and see what happens. That said, here's where to start: