
Friday, March 23, 2012

Many to one Select

I have 2 tables related as:

T1.KEY, T1.FIELD1, T1.FIELD2

T2.KEY, T2.FIELDA, T2.FIELDB
T2.KEY, T2.FIELDA, T2.FIELDB

T1.KEY = T2.KEY

I want to return a SELECT as:

T1.FIELD1, T1.FIELD2, T2.FIELDA, T2.FIELDB, T2.FIELDA, T2.FIELDB

The second table, in some cases but not all, has multiple rows for each
row in T1. I want to return a single row with all values for T2.FIELDA
and T2.FIELDB.

--
jeffvh
----------------------

jeffvh (jeffvh.251ikz@.no-mx.forums.yourdomain.com.au) writes:
> I have 2 tables related as:
> T1.KEY, T1.FIELD1, T1.FIELD2
> T2.KEY, T2.FIELDA, T2.FIELDB
> T2.KEY, T2.FIELDA, T2.FIELDB
> T1.KEY = T2.KEY
> I want to return a SELECT as:
> T1.FIELD1, T1.FIELD2, T2.FIELDA, T2.FIELDB, T2.FIELDA, T2.FIELDB
> The second table, in some cases but not all, has multiple rows for each
> row in T1. I want to return a single row with all values for T2.FIELDA
> and B.

So if for T1.KEY = 8 there are six rows in T2, there should be 14 columns:
two for T1 and twelve for T2?

I'm afraid that is not easily doable.

The result of a query is always a table, and a table has a fixed number
of columns; it cannot be jagged.

It is still possible to define a query that has the maximum number of
columns needed, but that number must be known in advance. You cannot write
a query that produces 16 columns on one execution and 20 columns the next.

Furthermore, we need rules to say which row goes into which column.

So in the general case this is very messy, and it may be easier to sort
it out client-side.

However, if there are further conditions that you know, but didn't tell us,
it might be easier. The general recommendation is that you post:

o CREATE TABLE statements for your tables.
o INSERT statements with sample data.
o The desired result given the sample.
o A short narrative of the business problem.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

This is called a "cross tab" report, and it is very hard to do in SQL.
Do some research and you will find examples, but they all require
custom SQL.
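If a fixed-width cross tab is too rigid, an alternative is to pack all child values into one delimited column with string aggregation, so the column count never varies. A minimal sketch using SQLite through Python (table names invented from the thread; on SQL Server 2017+ the analogous function is STRING_AGG, while older versions typically use a FOR XML PATH workaround):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE T1 (KEY1 INTEGER PRIMARY KEY, FIELD1 TEXT);
    CREATE TABLE T2 (KEY1 INTEGER, FIELDA TEXT);
    INSERT INTO T1 VALUES (1, 'a1'), (2, 'b1');
    INSERT INTO T2 VALUES (1, 'x'), (1, 'p'), (2, 'm');
""")

# One output row per T1 row; every matching T2 value packed into a
# single delimited column, however many child rows there are.
rows = con.execute("""
    SELECT t1.FIELD1, GROUP_CONCAT(t2.FIELDA, ', ') AS all_fielda
    FROM T1 AS t1
    JOIN T2 AS t2 ON t2.KEY1 = t1.KEY1
    GROUP BY t1.KEY1
    ORDER BY t1.KEY1
""").fetchall()
print(rows)
```

Note that GROUP_CONCAT does not guarantee the order of the concatenated values unless you impose one.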

many tables vs one big table

Dear experts,

If I have a number of related groups of data (e.g., stock prices for
different stocks), is it better to put them into many tables (e.g., one
table per stock) or into one big table (e.g., a single table where one
column is the stock id and the other columns are the price and other data)?

The single table approach with a clustered primary key including the
stock id and other information that is different for each stock seems
cleaner to me. But I worry that having a single table could hurt
performance when the database gets very large. The clustered primary
key should make queries on the single table fast, but I'm worried about
inserts. According to my vague understanding of a clustered primary
key, records are physically stored on disk based on the primary key. So
if I insert a record for a given stock, does the database have to
physically "move down" all the records that are below it, or does it do
something smarter?

To summarize:

1. Should I use a single table or many tables?

2. If using a single table, how are inserts handled by the DB to
prevent inserts from getting slower and slower as more items are added
to the DB?

Thanks in advance for your help and advice.

Use a single table, not many tables. Use a multi-column key that
includes the stock. When a new stock comes along it should be a user
transaction to add a row to the Stocks master table, not a database
administrator's job to add tables. The database design should be
stable, not in a constant state of change.

Performance of INSERTs does not change much between inserting into a
table with 10,000 rows or one with 10,000,000 rows.

Roy Harvey
Beacon Falls, CT
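Roy's design can be sketched concretely (SQLite through Python; the table and column names are invented for illustration): one Stocks master table plus one Prices table with a multi-column primary key, so adding a stock is a data change, never a schema change.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Stocks (
        stock_id   TEXT PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE Prices (
        stock_id   TEXT NOT NULL REFERENCES Stocks(stock_id),
        price_date TEXT NOT NULL,          -- ISO date string
        price      REAL NOT NULL,
        PRIMARY KEY (stock_id, price_date) -- the multi-column key
    );
""")

# A new stock is a user transaction, not a DBA job:
con.execute("INSERT INTO Stocks VALUES ('MSFT', 'Microsoft')")
con.execute("INSERT INTO Prices VALUES ('MSFT', '2007-01-02', 29.86)")
con.execute("INSERT INTO Prices VALUES ('MSFT', '2007-01-03', 29.81)")

# Queries on one stock stay simple and key-friendly:
rows = con.execute(
    "SELECT price_date, price FROM Prices"
    " WHERE stock_id = ? ORDER BY price_date",
    ("MSFT",),
).fetchall()
print(rows)
```

The composite key means a lookup on (stock_id, price_date) or a range scan over one stock's dates can be satisfied directly from the primary key.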

On 2 Jan 2007 08:57:21 -0800, "Emin" <emin.shopper@.gmail.com> wrote:

(snip)

Emin (emin.shopper@.gmail.com) writes:

> If I have a number of related groups of data (e.g., stock prices for
> different stocks) is it better to put them into many tables (e.g., one
> table per stock) or into one big table (e.g., a single table where one
> column is the stock id other columns are the price and other data)?


There are two reasons why you would have more than one table:

1) Information is so different for different (groups of) stocks
   that you get different sets of columns.

2) The table is really big. Then you might make it into a partitioned
   view or table, based on some condition. But that would not be one
   table per stock id.

So, in the logical design, that's one table.

> The single table approach with a clustered primary key including the
> stock id and other information that is different for each stock seems
> cleaner to me. But I worry that having a single table could hurt
> preformance when the database gets very large. The clustered primary
> key should make queries on the single table fast, but I'm worried about
> inserts. According to my vague understanding of a clustered primary
> key, records are physically stored on disk based on the primary key. So
> if I insert a record for a given stock does the database have to
> physicall "move down" all the records that are below that or does it do
> something smarter?


If the PK is monotonically ascending, new rows are just added at the
end and all continues smoothly. If the PK is not ascending, but for
instance a random GUID, then you will get a lot of page splits, and that
takes extra power. But the answer in this case is that you should
not cluster on the GUID, but on something else. It's perfectly possible
to have the clustered index on some other column(s).

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

On 2 Jan 2007 08:57:21 -0800, Emin wrote:

(snip)

> if I insert a record for a given stock does the database have to
> physicall "move down" all the records that are below that or does it do
> something smarter?


Hi Emin,

Roy answered most of your questions, but was friendly enough to leave
this bit for me to comment on.

Yes, it does something smarter. Rows are *NOT* physically stored in the
order imposed by the clustered index. The clustered index imposes a
*logical* ordering on the data, which is implemented by pointers that
form a logical chain of database pages.

If a row is inserted, SQL Server first navigates the clustered index to
find the correct location (database page). Then there are two
possibilities:

1. There's still space left on the page. The row is added and the insert
is finished.

2. There's not enough free space on the page. A page split occurs: half
of the data on the page is moved to a newly allocated page and pointers
are updated to reflect the location of the new page in the chain. After
that, the new row is inserted.

If your database has to process *LOTS* of inserts (in the order of
thousands per second or more), it makes sense to define your clustered
index such that new rows are always inserted at the logical end of the
pointer chain. In that case, page splits will never happen.
For lower amounts of inserts, the overhead of a page split is
insignificant.
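The behaviour described above can be mimicked with a toy model (pure Python, not SQL Server internals; the page size and split policy are simplified assumptions): a full page splits in half when a row lands in its middle, while appending past the end of the chain just starts a fresh page with no rows moving.

```python
# Toy model of page splits in a logical page chain.
PAGE_SIZE = 4

def insert(pages, key):
    """Insert key into the sorted page chain; return True if a page had
    to be split (i.e., existing rows had to move)."""
    # Find the page this key belongs on: the last page whose first key
    # is <= key (or the first page).
    target = 0
    for i, page in enumerate(pages):
        if page and page[0] <= key:
            target = i
    page = pages[target]
    if len(page) < PAGE_SIZE:          # room left: plain insert
        page.append(key)
        page.sort()
        return False
    if target == len(pages) - 1 and key >= page[-1]:
        # Appending past the end: allocate a fresh page, nothing moves.
        pages.append([key])
        return False
    # Full page hit mid-chain: move half its rows to a new page.
    half = PAGE_SIZE // 2
    new_page = page[half:]
    del page[half:]
    (page if key < new_page[0] else new_page).append(key)
    page.sort()
    new_page.sort()
    pages.insert(target + 1, new_page)
    return True

# Monotonically ascending keys: no splits at all.
pages = [[]]
asc_splits = sum(insert(pages, k) for k in range(20))

# Scattered keys: inserts land on full pages mid-chain and split them.
pages = [[]]
scattered = [13, 2, 19, 7, 0, 11, 5, 17, 3, 9, 15, 1, 18, 6, 12, 4, 16, 8, 14, 10]
rnd_splits = sum(insert(pages, k) for k in scattered)
print("ascending splits:", asc_splits, "scattered splits:", rnd_splits)
```

Under this model the ascending run never splits a page, while the scattered run does, which is the cost difference Hugo and Erland are pointing at.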

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

Dear Roy, Erland, and Hugo,

Thank you very much for your excellent responses. It sounds like a
single logical table is the way to go since the frequency of inserts is
not very high. If necessary, where would I get more information about
low-level details like page splits (e.g., how to control the page size,
how to determine when and how many page splits are occurring, etc.)?

Thanks again,
-Emin

On Jan 2, 5:53 pm, Hugo Kornelis <h...@.perFact.REMOVETHIS.info.INVALID> wrote:

(snip)

One more question: is there a process similar to defragmentation that I
should run periodically to collect split pages together?

Thanks again,
-Emin

On Jan 3, 1:42 pm, "Emin" <emin.shop...@.gmail.com> wrote:

(snip)

Emin (emin.shopper@.gmail.com) writes:

> One more question, is there a process similar to defragmentation that I
> should run periodically to collect split pages together?

Yes, DBCC DBREINDEX on SQL 2000. The preferred syntax on SQL 2005 is
ALTER INDEX ... REBUILD.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

On 3 Jan 2007 10:42:46 -0800, Emin wrote:

> If necessary, where would I get more information about
> low level details like page splits (e.g., how to control the page size,
> how to determine when and how many page splits are occuring, etc.)?


Hi Emin,

For that kind of information, the only serious resources are the books
written by Kalen Delaney. Quite advanced stuff, definitely not entry
level, but that's unavoidable if you want to know about page splits.

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

Monday, March 12, 2012

manipulating data before grouping & displaying

I have hierarchical (recursive) data, and every entry is assigned to a specific role. However, a child does not necessarily have the same role as its parent. Normally every child should be displayed underneath its parent, but when I group on the role, the entries get split up. Like:

Role x:
Parent 1
Child 1.1
Child 1.2
Parent 2
Role y:
Parent 3
Child 2.1
Child 2.2
Parent 4

I think you get the picture, but it should look like:

Role x:
Parent 1
Child 1.1
Child 1.2
Parent 2
Child 2.1
Child 2.2
Role y:
Parent 3
Parent 4

The easiest way I could think of was to manipulate the role column of the child entries so that it matches the role of their parent. Anyone got an idea of how to accomplish such a thing?

Write a recursive SQL cursor. Create a temp table with a new column. For each level down, add one more space to the new column, so a child gets " ", a grandchild gets a wider "  ", and so forth. Make sure the cursor inserts the rows in the correct order, as in your example above. The cursor traverses your data set and loads the values into the temp table. Then, when you print your tree, you can do something like:

newcolumn.value & node.value
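The suggested approach can be sketched in Python rather than a T-SQL cursor (sample rows and names invented to match the post): walk the tree depth-first, let every child inherit its top-level parent's role, build the widening-space indent column, then group by the inherited role.

```python
# Depth-first walk of the hierarchy: every entry inherits the role of
# its top-level parent; an indent column grows one space per level.

rows = [
    # (id, parent_id, role, name) -- invented sample shaped like the post
    (1, None, "x", "Parent 1"),
    (2, 1,    "x", "Child 1.1"),
    (3, 1,    "y", "Child 1.2"),   # its own role differs from its parent's
    (4, None, "x", "Parent 2"),
    (5, 4,    "y", "Child 2.1"),
    (6, 4,    "y", "Child 2.2"),
    (7, None, "y", "Parent 3"),
    (8, None, "y", "Parent 4"),
]

children = {}
for rid, parent, role, name in rows:
    children.setdefault(parent, []).append((rid, role, name))

def flatten(parent, inherited_role, depth, out):
    """Emit (effective_role, indented_name); children follow their parent."""
    for rid, role, name in children.get(parent, []):
        eff = inherited_role if inherited_role is not None else role
        out.append((eff, " " * depth + name))
        flatten(rid, eff, depth + 1, out)

flat = []
flatten(None, None, 0, flat)
flat.sort(key=lambda r: r[0])   # stable sort: groups by role, keeps tree order

for role, line in flat:
    print(role, line)
```

Because Python's sort is stable and the walk emits each subtree contiguously, grouping by role never breaks a parent away from its (role-adjusted) children.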