count(*) for SET datatype

Post Reply
michel.weber
Posts: 256
Joined: Sat Oct 08, 2005 12:40 pm

count(*) for SET datatype

Post by michel.weber »

I'm trying to implement something like the 'group by' behaviour of the parametric search interface, it works fine for data of type STRING, but I'm still confused about how the SET datatype works because i don't get the expected results :

I have trouble with results from the following SQL statement when the field is of type SET (Vortex or tsql) :

SELECT U_Publisher AS val, count(*) AS cnt FROM html GROUP BY U_Publisher

Given the following record set :

E:\Temp>tsql -d D:\texis\data1/PortalLiferayS\db1 "SELECT U_Publisher AS val FROM html WHERE U_Publisher LIKE 'pub4'"
Texis Version 05.01.1258488718(20091117) Copyright (c) 1988-2009 Thunderstone EPI
val
------------+
pub3,pub4,pub25,
pub1,pub4,
pub4,pub5,

Iwould expect either of the following results :
val cnt
------------+------------+
pub3,pub4,pub25, 1
pub1,pub4, 1
pub4,pub5, 1

or

val cnt
------------+------------+
pub1, 1
pub3, 1
pub4, 3
pub5, 1
pub25, 1

but i'm getting something like :

E:\Temp>tsql -d D:\texis\data1/PortalLiferayS\db1 "SELECT U_Publisher AS val, count(*) AS cnt FROM html WHERE U_Publisher LIKE 'pub4' GROUP BY U_Publisher"

val cnt
------------+------------+
pub1 1
pub4 1
pub3 1
pub4 1
pub25 1
pub3 1
pub4 1
pub25 1
pub1 1
pub4 2
pub5 1
pub4 1
pub5 1
pub3 1
pub4 1
pub25 1

or this with DISTINCT

E:\Temp>tsql -d D:\texis\data1/PortalLiferayS\db1 "SELECT DISTINCT U_Publisher AS val, count(*) AS cnt FROM html WHERE U_Publisher LIKE 'pub4' GROUP BY U_Publisher"

val cnt
------------+------------+
pub1 1
pub4 1
pub3 1
pub25 1
pub4 2
pub5 1
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

count(*) for SET datatype

Post by Kai »

I'd expect the second of your expected results, with that version of Texis (improvements to GROUP BY with strlst were added a few months prior).

What indexes exist on this table? SELECT * FROM SYSINDEX WHERE TBNAME='html';
michel.weber
Posts: 256
Joined: Sat Oct 08, 2005 12:40 pm

count(*) for SET datatype

Post by michel.weber »

Yes me too i expected the second result

For the custom fields there are oly regular indexes :
SELECT * FROM SYSINDEX WHERE TBNAME='html';
NAME TBNAME FNAME COLLSEQ TYPE NON_UNIQUE FIELDS PARAMS
------------+------------+------------+------------+------------+------------+------------+------------+
xhtmlurl html xhtmlurl A B 00 Url
xhtmlid html xhtmlid D B 01 id-
xhtmlhash html xhtmlhash A B 01 Hash
xhtml_nc html xhtml_nc A B 01 NextCheck
xhtmlvisit html xhtmlvisit D B 01 Visited-
xhtmlcat html xhtmlcat A B 01 Catno
xhtmldepth html xhtmldepth A B 01 ,Depth Url
xh_TiDsKyMtBy_ViMoDpPpCl html xh_TiDsKyMtBy_ViMoDpPpCl A F 01 ,,,,,,Title\Description\Keywords\Meta\Body Visited Modified Depth Pop id CatnoLowest
U_ILanguageCoE html U_ILanguageCoE A B 01 U_LanguageCoE
U_ISubject html U_ISubject A B 01 U_Subject
U_ICoverage html U_ICoverage A B 01 U_Coverage
U_IPublisher html U_IPublisher A B 01 U_Publisher
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

count(*) for SET datatype

Post by Kai »

Try using count(U_Publisher) instead of count(*). The more general field spec sometimes affects the optimizer.
Post Reply