PWB filters

General TSS software questions and comments.

Michelle

PWB filters

Post by Michelle »

PWB uses an "in string" check to determine whether a website is allowed or denied.

Sample URL file:

+all
-hotmail.com
-games

This would allow all websites except those with "hotmail.com" or "games" anywhere in the URL. For instance, the following sites would be denied:

http://www.hotmail.com
http://www.google.com?hotmail.com
http://www.games.com
http://www.anygames.com

When PWB parses a URL, it checks the URL against each string listed in the file. If a listed string is found anywhere in the URL, the URL is allowed or denied based on that entry's "+" or "-" prefix.
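
To illustrate, here is a minimal Python sketch of that in-string logic. This is not PWB's actual code; it assumes "+all"/"-all" set the default decision and that the remaining entries are checked in file order with the first match winning, which is consistent with the truth tables later in this thread.

# Minimal sketch of the "in string" filter logic (assumptions noted above).
def check_url(url, rules):
    """rules is a list of lines such as '+all', '-hotmail.com', '-games'."""
    allowed = False          # default if no "+all"/"-all" line is present (assumption)
    patterns = []
    for line in rules:
        line = line.strip()
        if not line:
            continue
        allow, text = line[0] == "+", line[1:]
        if text.lower() == "all":
            allowed = allow  # "+all" / "-all" set the default decision
        else:
            patterns.append((allow, text.lower()))
    for allow, text in patterns:
        if text in url.lower():   # the "in string" comparison
            return allow
    return allowed

rules = ["+all", "-hotmail.com", "-games"]
for url in ["http://www.hotmail.com", "http://www.games.com", "http://www.example.com"]:
    print(url, "->", "allowed" if check_url(url, rules) else "denied")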

There are also a few circumstances in which no URL is given, such as when a website opens a blank window before browsing, or with certain JavaScript commands. For these cases, the following should be added to the URL file:

JavaScript
about:blank

The IP filter works in a similar fashion, except the URL is converted to an IP address and then compared against the IP file.
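
For instance, an IP file (the address prefix below is only illustrative) might contain:

+all
-203.0.113.

This would deny any site whose address contains "203.0.113." and allow everything else.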

To enable the filters, set the respective settings in the INI file to True.

[Security]
...
CheckURLAccess=True
CheckIPAccess=True
...

The file PWB parses for each filter is specified by the corresponding INI setting.

[Files]
...
CheckURLFile=.\URL.txt
CheckIPFile=.\IP.txt
...

It is recommended that the full path to each file be entered in these settings to avoid issues with the current Windows directory.
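
For example, if the PWB files were installed in C:\PWB (the folder is only an example), the settings would look like:

[Files]
...
CheckURLFile=C:\PWB\URL.txt
CheckIPFile=C:\PWB\IP.txt
...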

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Using the filter files

Post by Scott »

You will need to add strings to the URL.txt file that match the URLs that are being denied. There is a balance to strike that gives enough access to allow what you want while denying what you don't. PWB uses "in string" logic to determine access. Here is a simple truth table to show this.

URL file:
-all
-other
+stuff

Truth Table:
www.whatever.com - denied
www.thisstuff.com - allowed
www.thisotherstuff.com - denied
www.anything.com/this?stuff - allowed
http://stuff.org - allowed

As you can see, as long as "stuff" is somewhere in the URL and "other" is not, access is allowed. It is usually better to place the denies ("-") before the allows ("+") in the URL file.

If you enable the PWB history log and the access log...

[Security]
...
WriteHistoryFile=True
...
LogAccess=True
...

PWB will log the URLs that are being denied in the history file (make sure you use the full path)...

[Files]
...
HistoryFile=.\History.txt
...
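
For example, with a full path (again, the folder is only an example):

[Files]
...
HistoryFile=C:\PWB\History.txt
...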

You can use this information to find common patterns among the various web sites you need to allow access to.

Not to confuse the issue, but there is also the IP filter. This filter converts the URL to an IP address, then uses the same logic to determine access, but on the IP.

IP File:
-all
-172.20.16.4
-172.20.14
+172.20

Truth Table:
172.20.15.3 - allowed
172.20.16.4 - denied
172.20.16.3 - allowed
172.20.14.3 - denied

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

Here are some suggestions.

The "+all" allows everything, adding in URLs ("+sunlife-usa.com") is not needed, but it does not hurt.

PWB v2 converts the "/" into a "\" for non-Internet addresses such as "C:\", "D:\", and "\\Server\Share", and converts the "\" into a "/" for access on the Internet. You can run into unwanted denials if you have strings such as "C:/" or "D:/" in the URL filter files. Removing these from your URL filter file may clear up the unwanted denials.

If you want to keep the patrons on a specific site, use "-all" to deny access to all web sites, then add in the sites you wish to allow access to. If the site uses JavaScript, adding "+JavaScript" and "+About:Blank" is a good idea.
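
For example, to keep patrons on a single site (the domain below is a placeholder), the URL filter file might look like:

-all
+JavaScript
+About:Blank
+yourlibrarycatalog.org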

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

With the release of PWB v2.04 revision 4, regular expressions are now supported in the filter files.

There are many excellent examples of using regular expressions on the Internet; here is a good one.

http://etext.lib.virginia.edu/helpsheets/regex.html

One problem you may encounter is that the backslash ("\") has special meaning in regular expressions, so you should adjust your filter files accordingly.

To use the backslash as a literal backslash in a comparison, preface it with the escape character, which is itself a backslash. That may sound confusing, but under normal circumstances the only time you need a backslash in the URL file is when blocking access to the local hard drives, so use a double backslash there.

For example, an old URL file:

...
-C:\
-D:\
...

Convert to:

...
-C:\\
-D:\\
...


--Scott
Last edited by Scott on Thu Dec 11, 2003 12:44 pm, edited 1 time in total.

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

If you are using the URL filter to prevent access to the C drive with "-C:\\" in your filter file, you may have trouble if a web site attempts to use Internet Explorer's internal web resources, such as the cancel page. You may want to add the resource into your URL filter file before the "-C:\\".

...
+res://c:\windows\system32\shdoclc.dll/navcancl.htm
-C:\\
...

This will prevent an error if the web page tries to access the resource. Make sure your path is correct, or shorten the URL to +res://.

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

You can control the types of files that are accessed by using regular expressions in the PWB URL filter. Use a regular expression that matches the end of the URL to deny access to that file type.

For example, to prevent downloading of ZIP and EXE files, put this in your URL filter file:

+all
-\.zip$
-\.exe$

The "+all" allows all URLs, the "\." interrupts the "." as literal, the "exe" or "zip" is the type of file, and the "$" is the regular expression to match the end.

This means only URLs that end with ".zip" or ".exe" are denied, while URLs that merely contain ".zip" or ".exe" elsewhere are still allowed.
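
As a quick standalone illustration of that end-of-URL anchor (plain Python regex, not PWB itself; the URLs are made up):

import re

# Deny patterns from the example above: URLs ending in .zip or .exe
deny = [r"\.zip$", r"\.exe$"]

for url in ["http://example.com/files/setup.exe",
            "http://example.com/setup.exe?download=1",
            "http://example.com/zipcodes.html"]:
    denied = any(re.search(pattern, url) for pattern in deny)
    print(url, "->", "denied" if denied else "allowed")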

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

To prevent patrons from adding an allowed string onto the URL to gain access, use the "begins with" regular expression anchor to match the beginning of the URL.

For example, with the following URL filter file:

-all
+google.com

You could potentially access the TeamSoftware URL by using the following in the address bar.

www.teamsoftwaresolutions.com?google.com

To prevent this, change your URL filter file to use the "begins with" ("^") regular expression anchor.

-all
+^http://www.google.com

This will restrict PWB to URLs that begin with "http://www.google.com", and not URLs that simply contain "google.com".
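
As a quick standalone check of that begins-with anchor (plain Python regex, not PWB itself):

import re

# The entry from the example above, as written; note the unescaped dots
# match any character (a later post covers escaping dots with a backslash).
pattern = r"^http://www.google.com"

for url in ["http://www.google.com/search?q=weather",
            "http://www.teamsoftwaresolutions.com?google.com"]:
    print(url, "->", "allowed" if re.search(pattern, url) else "denied")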

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

To use the asterisk (*) as a "wildcard" in a regular expression, you specify a dot (.) followed by an asterisk (*). The ".*" will match any characters for any length, but you will also need to account for the case where there are no characters at all, which is why the example below includes a separate entry for the bare domain.

-all
...
+^http://Google\.com
+^http://.*\.Google\.com
...

Notice the backslash (\) before the second and third dot (.) characters; this designates those dots (.) as literals instead of "wildcard" characters.
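
To illustrate (the URLs are made up, and this assumes the same matching behavior as the earlier truth tables):

Truth Table:
http://Google.com/search - allowed (matches the first entry)
http://mail.Google.com - allowed (matches the second entry)
http://www.fakeGoogle.com - denied (no "." directly before "Google", so neither entry matches)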

Here is another good tutorial on Regular Expressions.
http://www.regular-expressions.info/tutorial.html

--Scott
Last edited by Scott on Fri Jun 04, 2010 1:09 pm, edited 2 times in total.

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

Here is an application that can be used to check an address against your URL filter file.

http://www.teamsoftwaresolutions.com/fi ... rCheck.zip

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Post by Scott »

The following will limit PWB to URLs that start with "http://" or "https://" and whose host ends with the ".gov" top-level domain.

-all
+^(https?)://[A-Za-z0-9\-\.]+\.gov
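
For example (illustrative addresses, in the same truth-table style as above):

Truth Table:
https://www.usa.gov - allowed
http://travel.state.gov/passport - allowed
http://www.example.com - denied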

--Scott

JWessner
Observer
Posts: 3
Joined: Mon Oct 28, 2013 10:44 am
Location: Athens, Al

Re: PWB filters

Post by JWessner »

Will "-all" work when used with drives? I need to restrict people from shared drives on our network while allowing USB drives to be recognized.

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Re: PWB filters

Post by Scott »

"-all" denies all URLs unless a URL is specifically allowed with "+".

You could also use the [Security] settings OnlyAccessHTTP=True or OnlyAccessInternet=True.
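
For example, in the INI file (use whichever setting fits your situation):

[Security]
...
OnlyAccessHTTP=True
...

(or OnlyAccessInternet=True in its place)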

--Scott

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Re: PWB filters

Post by Scott »

Starting with PWB version 3.04.1 CEF, the following setting will check only the main frame of the page.

[Security]
CheckURLMainFrameOnly=True

This setting is not available in PWB IE.

--Scott

jmarkow1
Observer
Posts: 4
Joined: Thu Dec 20, 2018 2:29 pm

Re: PWB filters

Post by jmarkow1 »

Is there a way to disable a link already embedded in the website? For example, on our library's catalog page there is a link to the children's catalog page; I need to disable that because once you click on that link, you go to the kids' catalog and cannot go back (I don't know why, but it's designed that way). If anyone clicks that link, we have to shut down the browser and reopen it. It's an embedded link, not a URL in the address bar.

Scott
Site Admin
Posts: 2527
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN
Contact:

Re: PWB filters

Post by Scott »

Add the URL as denied in your URL filter file.

For example:
-http://abcmouse.com/library_account

This will deny access to the URL.

--Scott
