Chapter 3: File storage and structure
3.2 Document structure
The web is a very different
medium from the printed page. Data on the web is not
structured like a printed book, as links enable users
to go from one source of data to another in a completely
transparent way.
Readers must be able to
orientate themselves immediately, know that the data
can be trusted and find it easy to understand. Most
importantly, it must be demonstrable that the document
is up to date.
For this reason it is important
to ensure that the document has been constructed correctly
with the user in mind.
| Use each checklist to ensure
that your web pages comply with these guidelines |
| 3.2.1 Checklist and summary: Core guidance |
| Checklist |
- The document should include
the HTML title element
- All HTML pages within
the site should contain the department’s standard-text
navigation bar, which should be consistent throughout
the site
- The document must have
a meaningful heading at the very top
- A long document which
requires scrolling by the user should contain an informative
summary of 40 to 50 words, placed directly beneath the
heading
- If the document requires
it, there should be hyperlinked, downloadable alternative
versions (eg, PDF or text) listed beneath the summary
description
- If the document to be
published is broken into smaller sections for the
web, there must be clear navigation to enable users
to go forward or backwards within the document
- If the web page is part
of a larger document there should be clear identification
of this, for example, Page 5 of 12
- Each document should
contain a ‘published’ or ‘last updated’ date, clearly
shown at the top or bottom of the document
| Summary |
Because web users can surf
using other pages or search facilities as a first step,
it cannot be guaranteed that they have come to your
document through the front door.
Therefore each page in your
website must have a self-contained identity and be capable
of being seen as the first page. It should contain data
that orientates the user and gives context, such as:
- meaningful document headings;
- informative summaries;
- page numbering;
- document dating;
- consistent navigation
to the rest of your site.
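As a rough illustration of how some of these orientation items might be checked automatically, the sketch below scans a page for a title, a top-level heading and a date label. It assumes that the heading is marked up as an h1 element and that the date is introduced by the words ‘published’ or ‘last updated’; the class and function names are hypothetical, and the check is deliberately crude.

```python
import re
from html.parser import HTMLParser


class OrientationCheck(HTMLParser):
    """Collects the <title> text, the first <h1> text and all page text."""

    def __init__(self):
        super().__init__()
        self._inside = None          # tag currently being captured, if any
        self._heading_seen = False   # True once the first <h1> has closed
        self.title = ""
        self.heading = ""
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" or (tag == "h1" and not self._heading_seen):
            self._inside = tag

    def handle_endtag(self, tag):
        if tag == self._inside:
            if tag == "h1":
                self._heading_seen = True
            self._inside = None

    def handle_data(self, data):
        self.text += data
        if self._inside == "title":
            self.title += data
        elif self._inside == "h1":
            self.heading += data


def missing_orientation(html):
    """Return the names of the orientation items that appear to be missing."""
    check = OrientationCheck()
    check.feed(html)
    check.close()
    missing = []
    if not check.title.strip():
        missing.append("title")
    if not check.heading.strip():
        missing.append("heading")
    # Assumption: the date is labelled 'published' or 'last updated'.
    if not re.search(r"published|last updated", check.text, re.IGNORECASE):
        missing.append("date")
    return missing
```

An empty list means none of the three items was found to be missing. Summaries, page numbering and consistent navigation still need a human check.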
| 3.2.2 Structuring documents for the Web |
With file downloads, plug-ins, graphics and hyperlinks available,
there is little practical limit on what a website can offer.
A site looked at as a whole may make complete sense and have
a form of logical construction, but this picture can look
somewhat different when a single page of that site is looked
at out of context. This is the situation that many web users
have to contend with on a daily basis.
There are many different ways of getting to a particular
piece of information, and going through a website’s homepage
to get to it is perhaps one of the less likely. Another site
may have linked to this information, a newspaper may have
given a particular page’s URL or a user may have located the
information using a search facility.
Any one of these situations may mean that a user is introduced
to your site without seeing your introduction. They will be
unfamiliar with your navigation system and may not necessarily
know who the owner of the information actually is.
The design of each page within a website must therefore be
consistent, usable, and immediately identifiable with the
information’s owner. Ultimately this means that each separate
page of information should be seen as an island, existing
on its own but identifiably part of a whole.
Each of these ingredients will affect the experience the
user has while in your care.
IMPORTANT:
Do not use splash
screens with automatic client-side redirection as an
introduction to your site or to a section of your site.
Some browsers still do not support this HTML feature
and will therefore not redirect automatically. In those
browsers the splash screen just adds another click to a
user’s journey through your site.
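One way to find pages that break this rule is to scan them for a client-side redirect. The sketch below assumes that the redirection is implemented with a meta element whose http-equiv attribute is ‘refresh’, which is a common technique but is not named in this guidance; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RefreshFinder(HTMLParser):
    """Sets .found when a <meta http-equiv="refresh"> element is seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Attribute values keep their case, so compare in lower case.
        if tag == "meta" and (attributes.get("http-equiv") or "").lower() == "refresh":
            self.found = True


def has_meta_refresh(html):
    """Return True if the page appears to redirect via meta refresh."""
    finder = RefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Redirects triggered by script would not be detected by this check.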
The file name component of a URL will be displayed in the
address bar in the user’s web browser and will be exposed
in the status bar at the bottom of the browser window when
the user moves their pointing device onto a link in the document.
For this reason it is better to make file names indicative
of the document’s name or purpose. This will also help with
the general housekeeping of the website file structure, as
file names should be fairly obvious. For example, section
three of the corporate business plan for 2001 is far more
meaningful if called ‘businessplan01-03.htm’ rather
than ‘350165.htm’.
Where file names include dates these should conform to the
ISO 8601 standard and the W3C profile of it; that is, they
should be in yyyy-mm-dd format. This ensures that file names
have meaning in the long term and that lists of such files
will sort in a sensible order.
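The sorting benefit can be shown with a short sketch. The function name, the ‘press-notice’ stem and the .htm extension are illustrative assumptions, not part of the guidance.

```python
from datetime import date


def dated_file_name(stem, published, extension="htm"):
    """Build a file name that embeds the date in yyyy-mm-dd form."""
    return f"{stem}-{published.isoformat()}.{extension}"


# Hypothetical publication dates, deliberately out of order.
names = [
    dated_file_name("press-notice", date(2001, 8, 9)),
    dated_file_name("press-notice", date(2001, 2, 21)),
    dated_file_name("press-notice", date(2001, 3, 6)),
]

# A plain alphabetical sort is also a chronological sort,
# provided the files share the same stem.
ordered = sorted(names)
```

Because the year comes first and every field is zero-padded, no date-aware sorting is needed in a directory listing.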
A number of conventions should be considered when naming
your web files, most of which may well be determined by your
hosting service:
- An ‘8.3’ file-naming configuration may be required
(up to eight characters for the file name, a full point and
then an extension of up to three characters).
- File names should be in lower case, lessening the likelihood
of broken links or images as pages are moved from one system
to another because, for example, Linux file systems are
case sensitive but Windows NT file systems are not.
- There should be no spaces in file names.
- Where the file name is split, use the hyphen (-) character,
and avoid use of the underscore ( _ ). It is easier to read
'corp-plan.htm' than 'corp_plan.htm'.
- Other forms of punctuation and special characters should
be avoided.
- File names should be kept short but should also be descriptive.
- The HTML file-type extension (.htm, .html,
.asp or another) should be used consistently throughout
the site. Note that the allowable extensions depend upon
the web server’s file system and the MIME type mappings
in the server software. Decisions such as this need to be
made in consultation with the server administrator.
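The conventions above lend themselves to an automated check. The following is a minimal sketch, assuming that only letters, digits, hyphens and a single full point are acceptable; the function name and the wording of the problem messages are hypothetical, and the 8.3 test is optional because not every hosting service requires it.

```python
import re


def file_name_problems(name, require_8_3=False):
    """Return a list of naming conventions that the file name breaks."""
    problems = []
    if name != name.lower():
        problems.append("not lower case")
    if " " in name:
        problems.append("contains a space")
    if "_" in name:
        problems.append("uses an underscore instead of a hyphen")
    # Split on the last full point into file name and extension.
    stem, dot, extension = name.rpartition(".")
    if not dot or not stem or not extension:
        problems.append("missing file name or extension")
    # Spaces and underscores are reported above; anything else that is
    # not a letter, digit or hyphen counts as a special character.
    if re.search(r"[^A-Za-z0-9 _-]", stem + extension):
        problems.append("contains other punctuation or special characters")
    if require_8_3 and (len(stem) > 8 or len(extension) > 3):
        problems.append("does not fit the 8.3 pattern")
    return problems
```

For example, 'corp-plan.htm' passes with no problems, whereas 'Corp_Plan 2001.htm' is reported for its capitals, its space and its underscore.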
Large files are time-consuming and expensive to load, particularly
for clients with slow modems. Large homepages have the added
disadvantage that the user cannot choose a more economical
means of accessing the site.
Different types of web page require different file size restrictions,
thus:
- Homepage total file sizes should not
exceed 40kB.
- Standard, informational page total file
sizes should not exceed 120kB.
- Special pages (such as reports, statistical
data, etc, where it is advantageous for the user to be able
to download the file in one transaction) total file sizes
should not exceed 300kB.
In all cases, if the file size exceeds our recommended maximum,
the user should be warned. Good internal navigation within
a larger publication is also required (this is covered in
section
4.2.4).
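The limits above could be checked before publication with something like the sketch below. It assumes that 1kB means 1,024 bytes and that the total size of a page (HTML plus images) has already been measured; the dictionary keys and function name are hypothetical.

```python
# Recommended maximum total file sizes, in kB, taken from the list above.
LIMITS_KB = {"homepage": 40, "standard": 120, "special": 300}


def exceeds_limit(page_type, total_bytes):
    """Return True if the page is larger than the recommended maximum."""
    # Assumption: 1kB is treated as 1,024 bytes.
    return total_bytes > LIMITS_KB[page_type] * 1024
```

A True result would be the cue to warn the user or to break the page into smaller sections.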
The file size of an alternative, downloadable version of a
document varies with its format, because each format carries
a different amount of formatting data. Depending upon the
text-to-graphic ratio of a document, either a PDF or RTF file
may be considerably smaller or larger than the originating
document. Plain text will almost always save with the smallest
file size as it contains the bare minimum of document structure
and no images at all.
Whenever alternative formats are offered they should be listed
at the top of the HTML document, directly under the document
summary. The file size of each should be shown next to its
link to inform the user of potential download time.
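A label of this kind might be generated as in the sketch below. The 56kbps modem speed, the label wording and the function name are all assumptions made for illustration; real download times depend on the user’s connection.

```python
def download_label(file_format, size_bytes, modem_bits_per_second=56000):
    """Describe a download by format, size and a rough transfer time."""
    size_kb = size_bytes / 1024                       # assumes 1kB = 1,024 bytes
    seconds = size_bytes * 8 / modem_bits_per_second  # ignores protocol overhead
    speed = modem_bits_per_second // 1000
    return f"{file_format} ({size_kb:.0f}kB, about {seconds:.0f} seconds at {speed}kbps)"
```

The estimate is a best case, so it is better treated as a guide for the user than as a promise.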
If a proprietary format such as PDF or Microsoft Word is
used, a link to the reader software download site, using a
standard form of words, should be included next to the document.
Subsequent sections of a long document should have a link
to these downloadable formats at the very top, as many users
may not have been introduced to the document from its homepage.
| 3.2.5 Sequential structure |
A long document, for example an organisation’s business plan,
may be 80 pages long in its printed format. This document
would not transfer satisfactorily to the web as one file -
so it is better to break it up into manageable sections, which
may well translate into a number of separate HTML files reflecting
natural breaks in the content.
As stated earlier in this section, a user visiting your website
may not have been introduced to a document from its home page.
A link from another site, such as a search engine or another
department’s website, may have brought them to page 5 of a
10-page document. It must therefore be made obvious from the
very start where they are and how they can access other information
within the document.
This can be accomplished by adding section numbering to the
top of the page, to illustrate each element of the document
in a linear sequence. Numbering of this kind means that:
- Each section of a publication will be
accessible from any other section.
- The contents page of the document can
always be accessed.
- A user can choose simply to go to the
previous or the next section.
Consider providing a single file downloadable version of large
documents, suitable for reading off-line.
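The navigation described in this section could be generated for each page along the lines of the sketch below. The file-name pattern, the contents page name and the link wording are hypothetical; the point is only that every section, the contents page and the previous and next sections are all reachable from any page.

```python
def section_navigation(current, total, pattern="section-{:02d}.htm",
                       contents="contents.htm"):
    """Return an HTML navigation line for one section of a long document."""
    if not 1 <= current <= total:
        raise ValueError("current must be between 1 and total")
    links = [f'<a href="{contents}">Contents</a>']
    if current > 1:
        links.append(f'<a href="{pattern.format(current - 1)}">Previous</a>')
    for number in range(1, total + 1):
        if number == current:
            links.append(str(number))  # the current section is not a link
        else:
            links.append(f'<a href="{pattern.format(number)}">{number}</a>')
    if current < total:
        links.append(f'<a href="{pattern.format(current + 1)}">Next</a>')
    return f"Page {current} of {total}: " + " | ".join(links)
```

The first section has no ‘Previous’ link and the last has no ‘Next’ link, so users are never offered a link that leads nowhere.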
| 3.2.6 Document identification |
Any document or section of a document prepared for print will
invariably be larger than what can be physically displayed within
a web browser’s window at any one time. Most users will have
monitors with a screen size of 800 pixels by 600 pixels, although
many will have smaller screens.

Documents written in English are read left to right, top
to bottom. A document displayed on a browser will always have
the information contained in the top left-hand corner on screen
when the page is first delivered to the user. It is very important
to make best use of this space.
A browser has a scrolling function that enables the user
to move up and down a page. At worst, the need to scroll through
a document may discourage its use by users with a motor disability
who find it difficult to use a mouse.
Each page should include a selection of internal links to
different parts of the document that has been loaded. This
will enable users to access important information as soon
as they receive the page.
Each of the sections contained within the document should
include a ‘Back to top’ button and internal hyperlink
so that a user can quickly return to the top of the displayed
page.
| 3.2.7 Information to be included in every document |
A user should always be informed that each publication contained
within the organisation’s website is relevant and up-to-date.
This could be made obvious by the use of a document footer
within each of the files, whether they be in HTML, PDF, RTF,
XLS, CSV or text format.
Your approach could include the following information:
- the organisation’s name, so that the
user can identify the originators of the information;
- the document’s date of publishing/date
updated or a version number;
- the expiry date of the publication’s
relevance;
- an email address of the document’s owner
so that any discrepancies or comments regarding the publication
can be directed to the correct area.
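A plain-text footer carrying these four items could be assembled as in the sketch below, which should suit any of the formats mentioned. The field labels and function name are assumptions, and dates are written in yyyy-mm-dd form for consistency with the file-naming guidance.

```python
from datetime import date


def document_footer(organisation, updated, expires, owner_email):
    """Return a four-line footer identifying and dating a document."""
    lines = [
        organisation,
        f"Last updated: {updated.isoformat()}",
        f"Expires: {expires.isoformat()}",
        f"Comments to: {owner_email}",
    ]
    return "\n".join(lines)


# Hypothetical values, for illustration only.
footer = document_footer(
    "Example Department",
    date(2001, 8, 9),
    date(2002, 8, 9),
    "webmaster@example.org",
)
```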
| 3.2.8 Version control of web documents |
Making information available on your website is a publishing
process and organisations have established procedures for
content approval prior to posting. Because of the nature of
web publishing some information, for example, press notices,
can be produced in a very short time and changes or updates
to existing web pages can sometimes be undertaken in a matter
of minutes. It is therefore important that users are aware
of how current your published information is.
Every web page should clearly display the date it was first
published on your website. If you later amend the information
on a page, the displayed date should be updated to match.
With information that is frequently updated, web managers
should consider maintaining a version control record.
The simplest record would be to track the published version
in HTML comments, which are markup that is ignored by browsers
but can be seen by viewing the source code.
For example:
<!-- version control record of page /annex01.htm -->
<!-- first published 01/02/2001 -->
<!-- updated 21/02/2001, 06/03/2001, 09/08/2001 -->
The latest date recorded here should correspond with that
displayed via the browser.
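The correspondence between the record and the displayed date could be checked with a sketch like the one below. It assumes the comment layout of the example above, with dates written as dd/mm/yyyy after the words ‘published’ or ‘updated’; the function name is hypothetical.

```python
import re
from datetime import datetime


def latest_recorded_date(html):
    """Return the most recent date in the version control comments, or None."""
    dates = []
    for comment in re.findall(r"<!--(.*?)-->", html, flags=re.DOTALL):
        # Only comments that record a publication or an update are read.
        if "published" in comment or "updated" in comment:
            for text in re.findall(r"\b\d{2}/\d{2}/\d{4}\b", comment):
                # Assumption: dates are day first, as in the example record.
                dates.append(datetime.strptime(text, "%d/%m/%Y").date())
    return max(dates) if dates else None
```

The value returned can then be compared with the date shown on the page, and any mismatch flagged for the web manager.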