# ROUGH SET THEORY

Rough Set Theory (RST) has ben developed by Z. PAWLAK and coworkers in the 1980s.

An information system U(G,M) is a set of objects G and a set of attributes M and a binary relation gRm betwen the object g and the attribute m. Generally the relation R is an equivalence relation. For a given subset of attributes, the set of objects may be divided in equivalence classes (or indiscernibility classes).

Example (Table1):

a1a2a3a4
o1 1 1 1 1
o2 1 0 0 0
o3 1 1 0 1
o4 0 1 0 1
o5 1 0 1 0

The equivalence classes for the attribute {a1} are : {o1,o2,o3,o5}{o4}
The equivalence classes for the attribute {a4} are : {o1,o3,o4}{o2,o5}
For the subset of attributes {a1,a2,a4}, they are : {o1,o3}{o2,o5}{o4}

Lower and upper approximations
For each subset X U, one may define a lower approximation and an upper approximation.
The lower approximation is defined as the union of all equivalence classes which are fully included in that of X.
The upper approximation is defined as the union of all equivalence classes which have a non-empty intersection with that of X. In Table 1, let us consider the subset of attributes {a1,a2}. For the full set of objects A, the equivalence classes are: {o1,o3}{o2,o5}{o4} Considering now a subset of objects X ={o1,o2,o3} For the same subset of attributes {a1,a2}, the equivalence classes of X are: {o1,o3}{o2} The upper approximation of X is:      {o1,o3) + {o2,o5} = {o1,o2,o3,o5} The lower approximation of X is: {o1,o3}

Other important notions in RST are:

Reduct : the minimal set of attributes that preserve dissimilarity between objects. In other words, P Q is a reduct of Q, if P is minimal among all subsets of Q which generate the same classification as Q; the attributes within a reduct are independent, and none of them can be omitted for the description of Q. Reducts produce deterministic rules.

In Table 1, the equivalence classes for the set of attributes {a1,a2} are the same as for the set {a1,a2,a4}. Attribute {a2} is superfluous and {a1,a4} is a reduct of {a1,a2,a4}

Core : The intersection of all reducts of Q is called the core of Q, written as core(Q), and the elements of the core of Q are called indispensable for Q. If q Q, and q is not an element of any reduct of Q, then we call q redundant for Q. If Q = W we just speak of reducts of I, core of I, etc. The set of all reducts of I is denoted by Red(I). Reducts correspond to keys of a relational database.

Discrimination power : When an attribute is removed, the number of equivalence classes may remain unchanged; in that case, the attribute is said superfluous. However, in most cases, the number of equivalence classes decreases and the effect is stronger for the attributes that have the highest discrimination power. Based on this observation, a procedure has been implemented in Semana to calculate the discrimination power of each attribute. The whole DB is splitted randomly in two halves and the discrimination of each attribute is calculated for each half. Due to the random procedure, the operation is repeated until values reach an asymptote.

Concept Approximations based on RST (according to Saquer and Deogun):
The goal is to approximate a set of objects A, a set of attributes B or a pair {A,B} by the closest formal concepts.
Approximation of a set of objects by RST:
For a given subset of objects (A), the algorithm says whether it is feasible (i.e. if it is the extent of a formal concept) or it is definable (if it is included in the extent of a formal concept). If it is neither feasible nor definable, the algorithm gives the subconcept and the superconcept :
- subconcept : [a(b(A*)), b(A*)]
- superconcept : [a(b(A*)), b(A*)]
where A* et A* are respectively the lower and the upper approximation of A.
Approximation of a set of attributes by RST:
For a given subset of attributes (B), the algorithm says whether it is feasible (i.e. if it is the intent of a formal concept) or it is definable (if it is included in the intent of a formal concept). If it is neither feasible nor definable, the algorithm gives the subconcept and the superconcept :
- subconcept : [a(B*)), b(a(B*)]
- superconcept : [a(B*)), b(a(B*)]
where B* et B* are respectively the lower and the upper approximation of B.
Approximation of a pair {A,B} by RST:
For a given subset of objects (A) and subset of attributes (B), the algorithm says whether it is feasible (i.e. if it is a formal concept) or it is definable (if it is included in a formal concept). If it is neither feasible nor definable, the algorithm gives the two closest concepts called infimum and supremum.
{A,B}* = infimum = { a(b(A*)) a(B*) , b(a(b(A*)) a(B*)) }
{A,B}* = supremum = { a(b(A*)) b(a(B*)) , b(a(b(A*)) b(a(B*)) }
N.B.: the same authors have proposed another method based on a similarity index. Both are implemented in Semana (see FCA)

Minimal rules (Bolc, Cytowski and Stacewicz):

Bolc, L., Cytowski, J. & Stacewicz, P. (1996) O Logice i Wnioskowaniu Przybliżonym (On Logic and Rough Reasoning). Institute of Computer Science, Polish Academy of Sciences, ICS PAS Report 822 (in Polish), 1-54.

This important tool, also based on RST, is developed in in another stack of Semana (see Decision Logic).

References

DUNTSCH I., G. GEDIGA (2000). Rough Set Data Analysis.
KOMOROWSKI J., PAWLAK Z., POLKOWSKI L., SKOWRON A. (1999). Rough sets: A tutorial. In Rough Fuzzy Hybridization. (Pal, S. & Skowron, A. Eds.). Springer-Verlag, p. 3-98.
PAWLAK Z. (1982). Rough sets. International Journal of Information and Computer Science, 11, 341-356.
PAWLAK Z. (1991). Rough sets - Theoretical aspects of reasoning about data, Kluwer Academic.
SAQUER J., J. S. DEOGUN (1999). Formal Rough Concept Analysis, Proc. of 7th International Workshop, RSFDGrC'99, Yamaguchi, New Directions in Rough Sets, Data Mining, and Granular-soft Computing, Lecture Notes in Computer Science. Ed. Zhong, N., Skowron, A., and Ohsuga, S. Springer-Verlag, 1999.
SAQUER J., J. S. DEOGUN (2000), Concept approximations for formal concept analysis, The 8th International Conference on Conceptual Structures (ICCS'2000) (Darmstadt, Germany), vol. II Working with Conceptual Structures.
SAQUER J., J. S. DEOGUN (2001). Concept Approximations Based on Rough Sets and Similarity Measures.International Journal of Applied Mathematics and Computer Science

Apache/1.3.29 Server at celta.paris-sorbonne.fr Port 80