[reposted with correction in step 1]
For the problem in <fgrieu-CAECB2.19552928112005@news13-e.proxad.net>
ggr@qualcomm.com (Gregory G Rose) proposed
> the following pseudo-python
>
> # first make the matrix "as positive as possible"
> for each column:
> if number of (-1) > number of (+1):
> negate all entries of column
> # now sort it negative to top left
> for i = 0 to max(n,m)-1:
> sort the columns
> sort the rows
> assert that the top i rows and the left i columns have not changed
> # now hash
>
> I think the loop invariant is sufficient to prove that
> it does produce a canonical form.
I think it does not; one issue is that for columns with
an equal number of -1 and +1, the first step fails to
disembiguate polarity, which will result in different
raw order afterwards.
There is another and most profound problem with orders
in raws influencing how columns get sorted, and vice
versa, and I fail to see how this is bound to converge
to a unique state per C-equivalence class, and conjecture
it does not.
My best partial algorithm so far:
The general plan is that
- we initialy put alls rows in a single set, and will
find ways to split this set into subsets, with an order
in the subsets that is invariant under C-equivalence;
each of the new subsets will replace the original subset
in the ordered collection of subsets of alls the rows,
without a re-sort.
- we'll have a "polarized" flag for each colums, set when
we have disambiguated the column's polarity in a manner
invariant under C-equivalence; initialy the "polarized"
flag is set only for columns consisting entirely of
zeroes.
- we initialy split the set of columns into the subset
of columns that are polarized, and the subset of columns
that are not; we will proceed with ordered subdivision
as for rows.
We apply 6 rules in circular sequence until no progress
is made.
0) "Raws split by quietness"
Within each subset of raws with more than one raw,
we order and split the raws according to the
lexicographic orders of the string obtained by replacing
each current subset of columns by the number of zeroes
in the intersection of the raw and the columns subset.
1) "Raws split by balance" [revised]
Within each subset of raws with more than one raw,
we order and split the raws according to the
lexicographic orders of the string obtained by replacing
each current subset of *polarized* columns by
the sum of the numbers in the intersection; subsets of
non-polarized columns are ignored.
note: per initial conditions then rule 3, all columns
in each columns subset have the same polarized status.
2) "Column polarization"
We polarize a column when, in any subset of rows,
the sum of its element is non-zero, in which case we
adjust its polarity to make the sum positive in the first
subset, and mark the column as polarized.
3) "Column split by polarization"
Within each subset of columns with more than one column,
we order and split the columns according to their
polarized/non-polarized status.
4) "Column split by quietness"
Within each subset of columns with more than one column,
we order and split the columns according to the
lexicographic orders of the string obtained by replacing
each current subset of raws by the number of zeroes
in the intersection of the column and the raws subset.
5) "Column split by balance"
Within each subset of columns with more than one column,
we order and split the columns according to the
lexicographic orders of the string obtained by replacing
each current subset of raws by
- the sum of the numbers in the intersection, if the
column is already polarized;
- the absolute value of this sum, if the column is
not yet polarized.
note: per rule 3, all columns in the columns subset
have the same polarized/not polarized status.
When the above rules stop to make any progress, we are
left with matrixes at the intersections of raws/columns
subsets. If all these submatrixes contaìn the same value
(including but not limited to 1x1 matrixes), and if all
columns are polarized, the algorithm has suceeded.
If not, I guess there are ways to make further progress,
but fail to make a systematic description.
François Grieu
Received on Sat Dec 3 04:20:27 2005