

CLEAN UP YOUR ROOM 

[TIPS FOR ORGANIZING YOUR HOME DIRECTORY]

  by Mike Nguyen


"Clean Up Your Room" is not about your office or your work area. This document describes ways to improve the performance of your UNIX computing environment by organizing your files.

* * *

True Story

We have seen TSRI UNIX users who put every file, repeat - EVERY single FILE - in their home directories. There were no subdirectories! Hundreds of files were stored in one directory. Whenever the UNIX command "ls" was executed, it took minutes to complete, with the monitor flashing screenful after screenful of file names. The computer behaved painfully slowly, and it was very difficult for those users to find the file they wanted.

If you habitually use only ONE or A FEW directories and pack many files into them, you may be throwing away computer performance without even knowing it. This happens most often when you write programs that automatically process or save data.

* * *

Rule of thumb #1: Keep Your Directories SMALL

Whenever you access a file, your computer's operating system must locate that file for you. To do so, it must look up each component of the path in turn to learn exactly where on the disk drive your file is located.

- What is a path?

Without getting technical, a path is an expression that shows the location of a file. A path is made up of directory names and a file name, separated by forward slashes (/).

- An absolute path starts with a forward slash (/).
Ex.:
/export/home/johndoe/project1/mydocument.text

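- A relative path does not start with a forward slash. It may start with a name, with one dot (.), or with two dots (..). One dot stands for the current directory itself, and two dots stand for its parent directory. A relative path is always interpreted relative to another directory, usually the "current directory".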
Ex.:
project1/mydocument.text
./project1/mydocument.text
../project1/mydocument.text
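
To make the difference concrete, here is a minimal sketch in Python that resolves the four example paths above. It assumes, purely for illustration, that the current directory is /export/home/johndoe. The helper name resolve is our own and not part of any library.

```python
import posixpath

def resolve(path, current_dir):
    """Return the absolute path that `path` refers to when used from `current_dir`."""
    # posixpath.join ignores current_dir when `path` is already absolute;
    # normpath then removes "." and collapses ".." components.
    return posixpath.normpath(posixpath.join(current_dir, path))

# Assumed current directory, chosen only for this illustration.
cwd = "/export/home/johndoe"

examples = [
    "/export/home/johndoe/project1/mydocument.text",
    "project1/mydocument.text",
    "./project1/mydocument.text",
    "../project1/mydocument.text",
]

for p in examples:
    kind = "absolute" if posixpath.isabs(p) else "relative"
    print(kind, p, "->", resolve(p, cwd))
```

Note that the last example, which starts with two dots, ends up under /export/home rather than under johndoe, because ".." steps up to the parent directory first.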

- How does your computer read a file?

Let us suppose you are starting to work on this file: /usr/local/mytest.

Your computer's operating system starts at the root directory (/) and finds the entry for the "usr" directory. Next, in the "usr" directory, it looks for the "local" directory. Then, in "local", it searches for the file "mytest".
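
The walk described above can be sketched as a short Python function that lists, for an absolute path, which directory is searched for which name at each step. This is only a simplified model of the idea, not how any particular operating system implements it, and the function name lookup_steps is our own.

```python
def lookup_steps(path):
    """List the (directory, name) lookups needed to reach an absolute path."""
    if not path.startswith("/"):
        raise ValueError("this sketch only handles absolute paths")
    steps = []
    current = "/"                      # every absolute lookup begins at the root
    for name in path.split("/"):
        if not name:                   # skip the empty pieces produced by "/" separators
            continue
        steps.append((current, name))  # search `current` for the entry `name`
        current = current.rstrip("/") + "/" + name
    return steps

for directory, name in lookup_steps("/usr/local/mytest"):
    print("search", directory, "for", name)
```

Each step is a separate search through one directory, which is why the size of every directory along the path matters.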

If a directory is small, it may take up only a few "sectors" of disk space (if a hard disk is thought of as a book, then sectors are its pages). However, when your computer searches through a large directory that occupies more space, it has to read more directory entries from the disk and spend more time looking for the entry you want.

These disk reads are very expensive in computer terms; that is, reading from disk is very time-consuming. If you are trying to read a tiny file located in a badly structured directory that holds hundreds of files, your operating system might spend more time looking for the file than it would reading the file for you.

- What is the limit?

So, how many files can you place in a subdirectory before performance penalties accrue? We cannot give you a definite number, because "your mileage may vary". Only you can determine what performance trade-offs you are willing to make.

Unless we have a reason to do otherwise, we usually keep each of our directories to one or two screenfuls of file names. For both organization and performance, create subdirectories, then arrange your files among them according to a systematic plan, so that the subdirectories share a unified structure.
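
If you are not sure which of your directories have grown too large, a small script can find them for you. The following is a minimal sketch in Python. The function name large_directories and the limit of 50 entries are our own assumptions (roughly "a screenful or two"), so adjust the limit to your taste. The demonstration runs on a throwaway directory tree so that nothing real is touched.

```python
import os
import tempfile

def large_directories(top, limit=50):
    """Return (path, entry_count) for each directory under `top` with more than `limit` entries."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        count = len(dirnames) + len(filenames)
        if count > limit:
            found.append((dirpath, count))
    # Report the most crowded directories first.
    return sorted(found, key=lambda item: item[1], reverse=True)

# Demonstration on a temporary tree: one crowded directory, one tidy one.
with tempfile.TemporaryDirectory() as top:
    crowded = os.path.join(top, "crowded")
    tidy = os.path.join(top, "tidy")
    os.mkdir(crowded)
    os.mkdir(tidy)
    for i in range(60):
        open(os.path.join(crowded, "data%03d.out" % i), "w").close()
    for i in range(5):
        open(os.path.join(tidy, "data%03d.out" % i), "w").close()
    for path, count in large_directories(top):
        print(count, os.path.relpath(path, top))
```

To check your own files, you could call large_directories(os.path.expanduser("~")) instead; on a very large home directory the walk itself may take a while.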

For example, instead of keeping all files in one directory, you can create several subdirectories for different topics, projects, periods of time, and so on.

You and your computer will then find your files much more easily and quickly.
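
As one possible way to split a crowded directory by period of time, here is a minimal Python sketch that moves each file into a subdirectory named after the year and month it was last modified. The function name sort_by_month and the YYYY-MM naming scheme are our own choices, not an established convention. The sketch deliberately skips hidden "dot" files (such as login scripts), subdirectories, and symbolic links, and it never overwrites an existing file. The demonstration runs in a temporary directory; please try any such script on a copy before pointing it at real data.

```python
import os
import shutil
import tempfile
import time

def sort_by_month(directory):
    """Move each regular file in `directory` into a YYYY-MM subdirectory by modification time."""
    moved = {}
    for name in sorted(os.listdir(directory)):
        source = os.path.join(directory, name)
        if name.startswith("."):
            continue  # leave hidden files such as login scripts where they are
        if os.path.islink(source) or not os.path.isfile(source):
            continue  # leave symbolic links and subdirectories alone
        month = time.strftime("%Y-%m", time.localtime(os.path.getmtime(source)))
        target_dir = os.path.join(directory, month)
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, name)
        if os.path.exists(target):
            continue  # never overwrite an existing file
        shutil.move(source, target)
        moved[name] = month
    return moved

# Demonstration in a temporary directory with made-up file names and dates.
with tempfile.TemporaryDirectory() as home:
    for name, year, month in [("run1.out", 2004, 3),
                              ("run2.out", 2004, 3),
                              ("notes.txt", 2004, 7)]:
        path = os.path.join(home, name)
        open(path, "w").close()
        stamp = time.mktime((year, month, 15, 12, 0, 0, 0, 0, -1))
        os.utime(path, (stamp, stamp))   # pretend the file was last modified then
    print(sort_by_month(home))
    print(sorted(os.listdir(home)))
```

Sorting by date is only one scheme; grouping by project or topic usually needs a human decision and cannot be automated this simply.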

* * *

Rule of thumb #2: Keep Your File Names Short

It is a good idea to give a file a name that will remind you of its contents. But please do not overdo it. It is OK to create a file with a self-explanatory name like "CH3_to_NH2_lambda.out", but it is overkill to create hundreds of files with line-long names like:

"in_xplrig_init_stepup_posit_group_b_auto_tk_autodock_nocharge_sub_agent_d3"

A directory containing files with short names takes less space than a directory containing files with line-long names, and this helps performance.
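
A rough back-of-the-envelope calculation shows the effect. The Python sketch below estimates how many disk sectors a directory needs for 500 files named in the short style versus the long style quoted above. The numbers are assumptions for illustration only: 512-byte sectors, 8 bytes of fixed overhead per entry, and names padded to a multiple of 4 bytes. Real file systems differ in their exact layout, so treat the result as an indication of the trend, not a measurement.

```python
import math

SECTOR_BYTES = 512    # assumed sector size
ENTRY_OVERHEAD = 8    # assumed fixed bytes per directory entry (hypothetical layout)

def entry_bytes(name):
    """Bytes one entry occupies: fixed overhead plus the name rounded up to a multiple of 4."""
    return ENTRY_OVERHEAD + 4 * math.ceil(len(name) / 4)

def directory_sectors(names):
    """Sectors needed to hold entries for all `names` (ignores packing at sector boundaries)."""
    return math.ceil(sum(entry_bytes(n) for n in names) / SECTOR_BYTES)

short = "CH3_to_NH2_lambda.out"
long_ = "in_xplrig_init_stepup_posit_group_b_auto_tk_autodock_nocharge_sub_agent_d3"

# 500 files per directory, distinguished by a numeric suffix.
short_names = ["%s.%03d" % (short, i) for i in range(500)]
long_names = ["%s.%03d" % (long_, i) for i in range(500)]

print("short name:", len(short), "characters,", entry_bytes(short), "bytes per entry")
print("long name: ", len(long_), "characters,", entry_bytes(long_), "bytes per entry")
print("sectors for 500 short names:", directory_sectors(short_names))
print("sectors for 500 long names: ", directory_sectors(long_names))
```

Under these assumptions the directory with long names occupies more than twice as many sectors, so a search through it has correspondingly more to read.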

Technically speaking, the operating system speeds up file access by maintaining several levels of buffering. Frequently used disk sectors are cached in memory, because chances are good that the data will be needed again soon. Another level of buffering keeps information about the starting location of frequently used directories.

No matter how smart your operating system is, overly large directories and overly long file names put unnecessary load on your system. Your computer may have to read multiple sectors from the hard drive, overflow its disk sector buffers, and then discard old sector data to make room for the new. The larger the directory, the more information gets discarded, slowing down all processes as your computer is forced to re-read sectors from the disk drives.
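
The buffer overflow effect can be illustrated with a toy model. The Python sketch below simulates a sector cache that discards the least recently used sector when it is full, then counts how many real disk reads are needed to scan the same directory three times. The cache size of 8 sectors and the least-recently-used policy are assumptions chosen for illustration; actual operating systems use larger caches and more refined policies.

```python
from collections import OrderedDict

class SectorCache:
    """Toy least-recently-used cache holding at most `capacity` sectors."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.sectors = OrderedDict()
        self.disk_reads = 0

    def read(self, sector):
        if sector in self.sectors:
            self.sectors.move_to_end(sector)   # cache hit: mark as recently used
            return
        self.disk_reads += 1                   # cache miss: must go to the disk
        self.sectors[sector] = True
        if len(self.sectors) > self.capacity:
            self.sectors.popitem(last=False)   # discard the least recently used sector

def reads_for_repeated_scans(directory_sectors, capacity, scans=3):
    """Count disk reads when a directory of the given size is scanned `scans` times."""
    cache = SectorCache(capacity)
    for _ in range(scans):
        for sector in range(directory_sectors):
            cache.read(sector)
    return cache.disk_reads

print("small directory (4 sectors): ", reads_for_repeated_scans(4, capacity=8), "disk reads")
print("large directory (12 sectors):", reads_for_repeated_scans(12, capacity=8), "disk reads")
```

In this model the small directory is read from disk once and then served from memory, while the large directory no longer fits in the cache and must be re-read from disk on every scan.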

* * *

Conclusion

It may not matter that a directory is large if you use it only once in a while. However, the directories that you use often should be kept small to keep the system running at its peak. Typically, you can arrange such a directory as a hierarchical structure, breaking it into several smaller subdirectories and distributing the files among them in a systematic and structured fashion.

Remember, keeping your DIRECTORIES SMALL and your FILE NAMES SHORT can help you keep your data organized and get the best computing performance from your system.


* * *




Any questions, bug reports, or other comments regarding this web page should be directed to
Michael (Mike) Nguyen at x4-9364.

 

Copyright © 2004 TSRI.