When does the use of the Internet become ABUSE, and how can we identify the spiders that use the Internet for non-work-related reasons?
If a company executive is concerned about Internet usage in the workplace, they should ask a key question: What distinguishes employees who frequently surf the web ("spiders") from their non-surfing colleagues, and which websites capture their attention (and time)?
The executive must take several steps. First, they need to identify the most frequently visited websites, the favorites. Next, they must determine which employees are visiting these sites. However, the issue of "Internet abuse" isn't that simple. It's possible that 50% of those visiting the most frequented sites are actually some of the company's most effective employees. Thus, intense Internet usage might not reflect personal web surfing but could relate to job responsibilities. Simply tracking the number of visits won't reveal the true extent of non-work-related surfing. To address this, we need a different approach, one that accounts for human behavior.
Identifying Surfing Patterns
We assume that employees tend to visit one or two favorite websites and then continue surfing through the links on those pages. This is the typical behavior of what we'll call Internet spiders. We will define employees visiting favorite sites as active users. A key question then arises: What is the nature of their relationship with these sites?
Defining an Active User
First, we need to define an active Internet user. For example, we could consider any employee who visits at least 10 websites as active. However, this alone is insufficient. It's crucial that these employees themselves identify which websites are their favorites: those that a relatively large number of active users frequently visit. This creates a dual relationship: favorite websites define active users, and active users define favorite websites. By examining the content of these favorite sites, we can better understand the users' behavior.
This method filters out occasional surfers, focusing only on those employees who consistently visit certain sites. It also eliminates employees who visit a small number of websites for legitimate work purposes, what we'll call limited spiders. For example, if we set the threshold for an active user at 10 sites, we can analyze how many times a site must be visited by these users to become a favorite. Conversely, by observing how many favorite websites an employee visits, we can classify them as an active user.
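The dual relationship described above can be sketched as a repeated pruning: drop sites that too few remaining users visit, drop users who visit too few remaining sites, and repeat until nothing changes. This is only an illustration of the idea, not necessarily the procedure behind the figures reported later. The function name `find_active_core`, the `visits` mapping, and the two threshold parameters are assumptions made for this sketch.

```python
def find_active_core(visits, min_users_per_site, min_sites_per_user):
    """Return (active_users, favorite_sites) that define each other.

    visits: dict mapping each employee to the set of sites they visited
            (hypothetical input format chosen for this sketch).
    """
    users = set(visits)
    sites = set().union(*visits.values()) if visits else set()
    while True:
        # A site stays a favorite only if enough remaining users visit it.
        favorites = {
            s for s in sites
            if sum(1 for u in users if s in visits[u]) >= min_users_per_site
        }
        # A user stays active only if they visit enough remaining favorites.
        active = {
            u for u in users
            if len(visits[u] & favorites) >= min_sites_per_user
        }
        if favorites == sites and active == users:
            return active, favorites
        sites, users = favorites, active


# Made-up example data: "dee" is an occasional surfer and drops out,
# together with the sites that only one person visits.
example = {
    "ann": {"a", "b", "c"},
    "bob": {"a", "b", "c"},
    "cid": {"a", "b", "d"},
    "dee": {"e"},
}
active, favorites = find_active_core(example, 2, 2)
```

Both sets can only shrink from one pass to the next, so the loop always stops; with thresholds set too high it simply returns two empty sets.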
Surfing and Site Popularity
We measure the popularity of a website by assigning it a favor number, which reflects how many active users favor it. Similarly, we measure spider activity by calculating how many favorite sites an employee must visit to be considered an active user. The specific values for these measurements depend on factors such as the observation period and the total number of employees. We illustrate this concept below.
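As a rough illustration of these two measurements, the sketch below counts, for each site, how many active users visit it, and, for each employee, how many favorite sites they visit. The function names and the `visits` mapping (employee to set of visited sites) are assumptions introduced for this example; the data is made up.

```python
from collections import Counter


def favor_counts(visits, active_users):
    """For each site, count how many of the active users visit it."""
    counts = Counter()
    for user in active_users:
        counts.update(visits[user])  # each site counted once per user
    return counts


def activity_counts(visits, favorites):
    """For each employee, count how many favorite sites they visit."""
    return {user: len(sites & favorites) for user, sites in visits.items()}


# Made-up example data.
visits = {
    "ann": {"news", "mail", "shop"},
    "bob": {"news", "mail"},
    "cid": {"intranet"},
}
site_popularity = favor_counts(visits, {"ann", "bob"})
user_activity = activity_counts(visits, {"news", "mail"})
```

A site whose count reaches the chosen favor number qualifies as a favorite, and an employee whose count reaches the chosen activity number qualifies as an active user.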
Analyzing Website Visits
Suppose the executive requests a report on visited websites over a given period. During this time, someone (perhaps an administrative assistant) tracks each instance an employee visits a particular site. Once enough data is gathered, it can be summarized in a table, where websites are listed in rows, and employees are listed in columns. Each entry indicates that an employee has visited a specific website.
Using this data and our duality concept, the executive can discover who the active users are and what their favorite websites are.
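A minimal sketch of how such a table could be assembled from a raw log is shown below. It assumes the log is a list of (employee, site) pairs, one per recorded visit; that format, the function names, and the sample entries are assumptions made for illustration.

```python
def build_table(visit_log):
    """Summarize a visit log as rows of 'x' (visited) and '-' (not visited).

    visit_log: list of (employee, site) pairs, one per recorded visit.
    """
    employees = sorted({employee for employee, _ in visit_log})
    sites = sorted({site for _, site in visit_log})
    visited = set(visit_log)  # repeated visits collapse into one entry
    rows = [["Websites"] + employees]
    for site in sites:
        marks = ["x" if (e, site) in visited else "-" for e in employees]
        rows.append([site] + marks)
    return rows


def format_table(rows):
    """Render rows in a pipe-separated layout, one website per line."""
    return "\n".join(" | ".join(row) + " |" for row in rows)


# Made-up log entries.
log = [
    ("Empl.nr.1", "Site nr.1"),
    ("Empl.nr.2", "Site nr.1"),
    ("Empl.nr.1", "Site nr.2"),
    ("Empl.nr.1", "Site nr.1"),  # a repeat visit
]
table_text = format_table(build_table(log))
```

Names are sorted as plain strings here, so a real report with ten or more sites would need a numeric sort key to keep the rows in their natural order.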
Defining Expertise
Another way to identify active web users is to form an Internet Expert Committee. This committee would consist of employees who have visited specific websites at least four times within a predefined list of relevant sites. The expertise level of each committee member is determined by the number of different websites they've visited on this list. One committee could have a higher expertise level if all its members have visited more websites than the members of another committee. The goal would be to find the committee with the highest expertise level.
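One way to make the committee comparison concrete is to score a committee by its least experienced member, so that a committee ranks higher only when every member has visited more listed sites. That reading of "expertise level" is an assumption, as are the function names and the made-up data in this sketch.

```python
def member_expertise(member, visits, site_list):
    """Number of different sites on the list that this member has visited."""
    return len(visits[member] & site_list)


def committee_expertise(committee, visits, site_list):
    """Committee level, taken here as the lowest level among its members."""
    return min(
        (member_expertise(m, visits, site_list) for m in committee),
        default=0,  # an empty committee has no expertise
    )


# Made-up example data.
site_list = {"s1", "s2", "s3", "s4"}
visits = {
    "ann": {"s1", "s2", "s3", "s4"},
    "bob": {"s1", "s2", "s3"},
    "cid": {"s1", "other"},
}
level_small = committee_expertise({"ann", "bob"}, visits, site_list)
level_large = committee_expertise({"ann", "bob", "cid"}, visits, site_list)
```

With this scoring, adding a less experienced member can only lower a committee's level, which is why the search for the best committee tends to end with a small group.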
Case Study
In the table below, we apply our duality concept to a dataset of 4,368 website visits by 183 employees over a period of 1.5 months. Out of the 4,368 sites, only 105 made it to the expert committee top list, meaning they were defined as favorites. The "favor number" of these sites is 4, while the "activity/expertise" number for employees is 53.
These 105 favorites were visited by just 8 employees, who are therefore classified as active users or expert committee members.
Websites | Empl.nr.1 | Empl.nr.2 | Empl.nr.3 | Empl.nr.4 | Empl.nr.5 | Empl.nr.6 | Empl.nr.7 | Empl.nr.8 |
Site nr.1 | x | x | x | x | x | x | x | x |
Site nr.2 | x | x | x | x | x | x | x | x |
Site nr.3 | x | x | x | x | x | x | x | x |
Site nr.4 | x | x | x | x | x | x | x | x |
Site nr.5 | x | x | x | x | x | x | x | x |
Site nr.6 | x | x | x | x | x | x | x | x |
Site nr.7 | x | x | x | x | x | x | x | x |
Site nr.8 | x | x | x | x | x | x | x | x |
Site nr.9 | x | x | x | x | - | x | x | x |
Site nr.10 | x | x | x | x | - | x | x | x |
Site nr.11 | x | - | x | x | x | x | x | x |
Site nr.12 | x | - | - | x | x | x | x | x |
Site nr.13 | x | x | x | x | x | x | x | - |
Site nr.14 | x | x | x | x | x | x | x | - |
Site nr.15 | x | x | x | x | x | - | x | x |
Site nr.16 | x | x | x | x | - | - | x | x |
Site nr.17 | x | x | x | x | x | x | - | x |
Site nr.18 | x | x | x | - | - | x | x | x |
Site nr.19 | x | x | - | x | - | - | x | x |
Site nr.20 | x | x | - | x | x | x | - | x |
Site nr.21 | x | - | x | - | x | x | x | x |
Site nr.22 | x | - | x | - | - | x | x | x |
Site nr.23 | x | - | x | - | - | x | x | x |
Site nr.24 | x | - | x | - | - | x | x | x |
Site nr.25 | x | - | x | x | - | x | - | x |
Site nr.26 | x | - | - | x | - | - | x | x |
Site nr.27 | x | - | - | x | x | x | - | x |
Site nr.28 | x | x | x | x | x | - | x | - |
Site nr.29 | x | x | x | x | - | - | x | - |
Site nr.30 | x | x | x | x | x | x | - | - |
Site nr.31 | x | x | x | x | x | x | - | - |
Site nr.32 | x | x | x | - | - | x | x | - |
Site nr.33 | x | x | x | x | x | - | - | x |
Site nr.34 | x | x | x | x | x | - | - | x |
Site nr.35 | x | x | x | x | x | - | - | x |
Site nr.36 | x | x | - | x | x | - | x | - |
Site nr.37 | x | x | x | - | - | x | - | x |
Site nr.38 | x | x | - | - | - | x | x | - |
Site nr.39 | x | x | - | - | x | - | x | x |
Site nr.40 | x | x | - | x | x | - | - | x |
Site nr.41 | x | x | - | x | x | - | - | x |
Site nr.42 | x | x | - | - | x | x | - | x |
Site nr.43 | x | x | - | - | x | x | - | x |
Site nr.44 | x | x | - | - | x | x | - | x |
Site nr.45 | x | x | - | - | - | x | - | x |
Site nr.46 | x | - | x | x | - | - | x | - |
Site nr.47 | x | - | x | x | - | - | x | - |
Site nr.48 | x | - | x | x | - | - | x | - |
Site nr.49 | x | - | x | x | - | - | x | - |
Site nr.50 | x | - | x | x | - | - | x | - |
Site nr.51 | x | - | x | x | x | x | - | - |
Site nr.52 | x | - | x | x | x | x | - | - |
Site nr.53 | x | - | x | - | - | - | x | x |
Site nr.54 | x | - | x | - | - | - | x | x |
Site nr.55 | x | - | x | x | - | - | - | x |
Site nr.56 | x | - | - | - | x | x | x | - |
Site nr.57 | x | - | x | - | - | x | - | x |
Site nr.58 | x | - | - | - | x | x | - | x |
Site nr.59 | x | - | - | - | x | x | - | x |
Site nr.60 | x | x | x | - | x | - | x | - |
Site nr.61 | x | x | x | x | x | - | - | - |
Site nr.62 | x | x | x | x | x | - | - | - |
Site nr.63 | x | x | x | - | - | - | x | - |
Site nr.64 | x | x | x | - | x | x | - | - |
Site nr.65 | x | x | x | - | x | x | - | - |
Site nr.66 | x | x | x | x | - | - | - | - |
Site nr.67 | x | x | x | x | - | - | - | - |
Site nr.68 | x | x | x | x | - | - | - | - |
Site nr.69 | x | x | x | x | - | - | - | - |
Site nr.70 | x | x | x | - | - | - | - | x |
Site nr.71 | x | x | - | x | x | - | - | - |
Site nr.72 | x | x | - | - | x | x | - | - |
Site nr.73 | x | x | - | - | x | x | - | - |
Site nr.74 | x | x | - | - | x | x | - | - |
Site nr.75 | x | x | - | - | x | x | - | - |
Site nr.76 | x | x | - | - | x | x | - | - |
Site nr.77 | x | x | - | - | x | x | - | - |
Site nr.78 | x | x | - | - | x | x | - | - |
Site nr.79 | x | x | - | - | x | x | - | - |
Site nr.80 | x | - | x | - | x | - | x | - |
Site nr.81 | x | - | x | x | x | - | - | - |
Site nr.82 | x | - | x | - | x | x | - | - |
Site nr.83 | x | x | x | - | x | - | - | - |
Site nr.84 | - | x | x | x | x | x | x | - |
Site nr.85 | - | x | x | x | - | - | x | x |
Site nr.86 | - | x | - | x | - | x | x | - |
Site nr.87 | - | x | - | x | x | - | x | x |
Site nr.88 | - | x | - | x | x | - | x | x |
Site nr.89 | - | x | - | - | - | x | x | x |
Site nr.90 | - | - | x | x | - | - | x | x |
Site nr.91 | - | - | x | x | - | - | x | x |
Site nr.92 | - | - | x | x | - | - | x | x |
Site nr.93 | - | - | - | x | x | x | - | x |
Site nr.94 | - | x | x | x | x | - | x | - |
Site nr.95 | - | x | x | x | - | - | x | - |
Site nr.96 | - | x | - | x | x | - | x | - |
Site nr.97 | - | x | - | x | x | - | - | x |
Site nr.98 | - | x | - | - | x | x | - | x |
Site nr.99 | - | - | x | x | x | x | - | - |
Site nr.100 | - | - | x | x | x | - | - | x |
Site nr.101 | - | - | x | x | x | - | - | x |
Site nr.102 | - | x | x | - | x | - | x | - |
Site nr.103 | - | x | x | x | x | - | - | - |
Site nr.104 | - | x | x | x | x | - | - | - |
Site nr.105 | - | x | x | - | x | x | - | - |
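A reader who wants to check a table like this against the two thresholds could tally the "x" marks per row and per column. The sketch below assumes each website occupies one pipe-separated line in the same layout as the header row; the function names are hypothetical, and the sample contains only three rows taken from the table, so its column totals say nothing about the full dataset.

```python
def parse_table(text):
    """Parse pipe-separated rows into (employees, {site: set of visitors})."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|") if cell.strip()]
    employees = header[1:]
    table = {}
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        table[cells[0]] = {
            e for e, mark in zip(employees, cells[1:]) if mark == "x"
        }
    return employees, table


def row_totals(table):
    """How many employees visited each site."""
    return {site: len(visitors) for site, visitors in table.items()}


def column_totals(employees, table):
    """How many of the listed sites each employee visited."""
    return {
        e: sum(1 for visitors in table.values() if e in visitors)
        for e in employees
    }


# Three rows copied from the table, for illustration only.
sample = """
Websites | Empl.nr.1 | Empl.nr.2 | Empl.nr.3 | Empl.nr.4 | Empl.nr.5 | Empl.nr.6 | Empl.nr.7 | Empl.nr.8 |
Site nr.1 | x | x | x | x | x | x | x | x |
Site nr.9 | x | x | x | x | - | x | x | x |
Site nr.38 | x | x | - | - | - | x | x | - |
"""
employees, table = parse_table(sample)
sites_ok = all(total >= 4 for total in row_totals(table).values())
```

Run over the complete table, the same row totals can be compared with the favor number and the column totals with the activity number.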