Slowness in Blue Yonder (JDA/RedPrairie)

Junaid Abbasi
12 min readOct 16, 2020

While providing support on BY (formerly JDA) WMS, I have noticed Application slowness to be a prominent problem faced by BY WMS users. There could be many different reasons behind it. However, I am going to discuss some common ones.

  1. Database blocking
  2. Bad query Plan
  3. High CPU usage
  4. High Memory usage
  5. Network slowness

Database blocking:

Generally, database blocking can cause slowness. Though BY WMS has improved on this a lot in the newer versions. However, it can still occur sometimes and has the potential to cause slowness for a distribution site. First, let me explain what database blocking is, what are deadlocks, and how it happens?

Database blocking occurs when one session is holding a resource by putting a lock on it and other sessions are waiting for this.

When two sessions are reading same data there is ideally no reason that one may hold lock on it. However, when one session is performing insert/update or delete it must hold lock to avoid dirty reads. Sometimes, when a session is going to perform a data change task (for example during allocation or pick release) it puts a lock on that row or page so other sessions must wait for the change to complete. If this takes a longer time, it can cause a chain of blocking sessions is created and eventually it blocks multiple database sessions and then user face slowness during operation from Fat Client or Portal. In this case we may need to get the session at top end and check what is it waiting for? Sometimes due to any system issue a session gets stuck and never releases the lock and, in this case, we must kill this session to keep other sessions to continue working.

In SQL server SP_WHO2 stored procedure can be used to check if there are any blocking sessions, but it still required to view multiple blocking sessions and get the source blocking session.

Below script can be used to get source blocking session which is not blocked and is blocking other sessions.

Oracle:

[SELECT s.sid AS sessionId, 
s.serial# AS SerialNumber,
s.status AS sessionStatus,
s.state AS sessionState ,
s.machine AS Server,
s.blocking_session AS Blocker,
s.blocking_session_status AS BlokerStatus,
s.seconds_in_wait,b.sid as blockingSession,
b.serial# AS blockingserialNumber,
b.machine AS BLockingSessionserver
FROM v$session s,
v$session b
WHERE s.blocking_session is not null
AND s.blocking_session <> s.sid
AND s.blocking_session not in (select sid
from v$session
where blocking_session is not null
)
AND b.sid = s.blocking_session
And s.sid> 50
]
/*following query can also be used to get locks on database*/[select sw.seconds_in_wait,
sw.seconds_in_wait / 60 as minutes_waiting,
s1.username blocker_user ,
s1.machine blocker_machine,
s1.sid blocker_sid ,
s1.serial# blocker_serial,
s2.username blocked_user,
s2.machine blocked_machine,
s2.sid blocked_sid,
s2.serial# blocked_serial,
dbo.object_name blocked_object,
s2.status blocked_status,
s1.status blocker_status,
sql1.sql_text blocker_sql,
sql2.sql_text blocked_sql
from gv$lock l1
join gv$session s1
on s1.sid = l1.sid
join gv$lock l2
on l1.id1 = l2.id1
join gv$session s2
on s2.sid = l2.sid
join gv$session_wait sw
on sw.sid = l2.sid
join gv$locked_object lo
on lo.session_id = s1.sid
join dba_objects dbo
on dbo.object_id = lo.object_id
left join v$sql sql1
on sql1.sql_id = s1.sql_id
and sql1.address = s1.sql_address
left join v$sql sql2
on sql2.sql_id = s2.sql_id
and sql2.address = s2.sql_address
where l1.BLOCK in (1, 2)
and l2.request > 0
order by sw.seconds_in_wait
] catch (-1403)
|
if (@? != 0)
{
publish data
where message = 'No DB Blocking Locks Found'
}
else
{
filter data
where moca_filter_level = '2'
}

SQL Server:

[SELECT sp.spid, 
sp.status,
sp.loginame [Login],
sp.hostname,
sp.blocked BlkBy,
sp.cmd Command,
sp.cpu CPUTime,
sp.physical_io DiskIO,
sp.waittime/1000 'Wait Time (Sec)',
sp.login_time LoginTime,
sp.last_batch LastBatch,
sp.program_name
FROM master.dbo.sysprocesses sp
WHERE sp.blocked <> 0
AND sp.blocked <> sp.spid
and blocked not in (select spid
from master.dbo.sysprocesses
where blocked <> 0
)
]

Blocking found!!! What are next steps?

Once blocking is found and system is getting slow then take following steps for further investigation.

· What is the execution behind blocking session?

· Is this occurring on same query multiple times?

· Is the scenario re-creatable?

· Should it be killed?

Get MOCA Command through Console

If above queries returned blocking sessions, then you can also find the thread on console and check which MOCA command is being executed behind this.

Oracle database blocking

For oracle get v$session.module and in SQL Server sysprocesses.program_name are the columns which are to be linked on MOCA console.

Database Connections on JDA Console

In order to get session and MOCA command behind blocking check for that module (in case of oracle) and programe_name (in case of SQL SERVER) on console under “Database Connections” tab. In database section get thread_id and find this thread under “Sessions” tab on console.

Now under Sessions tab check for thread id found against connect in database connections. Session tab will show running SQL, MOCA Command, environment variables like usr_id etc.

JDA Console Sessions

Is this command proclaimed?

Form history you must know if this command was reported as a culprit multiple time. If yes, check the object (table) being locked and check for the indexes on this object, are those setups correctly? (As this is not a simple task so a dba could also be involved to check indexes). However, to keep track of database objects, a DBA should never load indexes directly onto the database server and they must be installed via a rollout during scheduled maintenance window.

Is this re-creatable?

If indexes seem correct and system performance is getting worse, then check if the process is re-creatable on lower system and pass on the blocking session to DBA for further investigation at DB side and to kill the relevant session.

Killing the blocking session

Now it is DBA’s task to investigate the blocking on db. From a user’s perspective the only application he is working on is important and they are not aware of what else is being processed on db concurrently. DBA must check if there any long running processes being executed (e.g. backup job)? Once DBA confirms that the reported blocking session is the culprit and is unable to provide/resolve the main reason which has caused this blocking session to wait then the last thing they may do is to kill the session.

What DBA must know before killing a session?

· Blocking session

· Blocked sessions/objects

· Is this important to kill in time-sensitive situation

· Is the session blocking other objects for more than a reasonable time?

· The correct SID has been taken to be killed

To kill a session in oracle

ALTER SYSTEM KILL SESSION ‘<sid>,<serial#>’ IMMEDIATE; 

To kill a session in SQL server

kill <sid>

Deadlock:

Let’s take an example to explain this term. We have two sessions S1 and S2 both need two resources R1 and R2. S1 came into action and locked R1 and S2 locked R2. S1 is waiting for R2 to complete its task and S2 is waiting for R1 to complete its task. So, both sessions are is waiting for the other to release resource and not releasing its own. This situation is termed as deadlock.

In database when such situation is created then the Database Engine decides which session should be given resources and kills the other session. This sort of situation is generally not handled by a support resource and is generally not a cause of concern on newer databases and the database engine is smart enough to proceed in such a situation.

Deadlock is highly related to db blocking but it has a little difference, and it doesn’t create slowness but in this case one session is to be killed. But if this situation occurs regularly on a specific scenario then the user/developer may need to review the query and change accordingly.

Bad query plan

A query plan is a set of steps that the database management system executes in order to complete the query. Query optimizer determines those steps for each query by applying different algorithms to get optimized query plan or execution plan.

Sometimes a specific process suddenly starts getting performance issues without any code change whereas it was working perfectly few hours ago. This is the main symptom that query plan may have changed. Query optimizer changes the plan as per its algorithm and chooses the best plan but from user’s point of view it is not the perfect plan and cause slowness. Now user/DBA must compare the plans and pin the one which was working perfectly. If a plan is pinned, then query optimizer never changes it.

If floor users are reporting slowness on specific process like picking. Allocation etc. get trace of that process and check for the query consuming more time. Once you found the slow query check for that query’s plan in database. If overall system performance is slow then we may also get topmost slow queries and analyze them.

How to get topmost slow queries:

If you are monitoring system performance and you want to get all running queries and check performance. Below scripts could be used to get currently running queries. Then putting filter on execution time, we may get top n queries and then check for its plan etc.

Oracle:

[select nvl(ses.username, 'ORACLE PROC') || ' (' || ses.sid || ')' USERNAME,
SID,
MACHINE,
REPLACE(SQL.SQL_TEXT, CHR(10), '') STMT,
ltrim(to_char(floor(SES.LAST_CALL_ET / 3600), '09')) || ':' || ltrim(to_char(floor(mod(SES.LAST_CALL_ET, 3600) / 60), '09')) || ':' || ltrim(to_char(mod(SES.LAST_CALL_ET, 60), '09')) RUNT
from V$SESSION SES,
V$SQLtext_with_newlines SQL
where SES.STATUS = 'ACTIVE'
and SES.USERNAME is not null
and SES.SQL_ADDRESS = SQL.ADDRESS
and SES.SQL_HASH_VALUE = SQL.HASH_VALUE
and Ses.AUDSID <> userenv('SESSIONID')
order by runt desc,
1,
sql.piece
]

SQL Server:

[SELECT sqltext.TEXT, 
req.session_id,
req.status,
req.command,
req.cpu_time,
req.total_elapsed_time/1000 seconds,
req.wait_type,
req.blocking_session_id
FROM sys.dm_exec_requests req CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sqltext
where text like '%%'
order by seconds desc
]

How to get execution/query plan?

Oracle:

Once slow queries have found next step is to examine their execution plan. In oracle “v$sql.sql_fulltext” contains the query and ”v$sql.plan_hash_value” is the key to its plan. If there are multiple plans, check for the latest one and Analyze the plan. join “v$sql_plan” on plan_hash_value to get the query plan.

Sql_fultext contains bind variables so you may ignore where clause and check for select and join clause using like operator and then from given rows you may further filter the SQL and get the exact query you are searching for.

Let say we found a SQL getting more than reasonable time which is

select distinct ord.ordnum
from shipment_line
join ord_line
on ord_line.client_id = shipment_line.client_id
and ord_line.ordnum = shipment_line.ordnum
and ord_line.ordlin = shipment_line.ordlin
and ord_line.ordsln = shipment_line.ordsln
and ord_line.wh_id = shipment_line.wh_id
join ord
on ord.client_id = ord_line.client_id
and ord.ordnum = ord_line.ordnum
and ord.wh_id = ord_line.wh_id
where ship_id = :var_ship_id
and shipment_line.wh_id = :var_wh_id
and ordtyp in (:q0)

Search v$sql for this for given query using like operator, also replace all white space characters with wildcard “%”

SQL Server:

In SQL server following query can be used to get execution plan.

[SELECT databases.name, 
dm_exec_sql_text.text AS TSQL_Text,
dm_exec_query_stats.creation_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.total_worker_time AS total_cpu_time,
dm_exec_query_stats.total_elapsed_time,
dm_exec_query_stats.total_logical_reads,
dm_exec_query_stats.total_physical_reads,
dm_exec_query_plan.query_plan
FROM sys.dm_exec_query_stats CROSS APPLY sys.dm_exec_sql_text(dm_exec_query_stats.plan_handle) CROSS APPLY sys.dm_exec_query_plan(dm_exec_query_stats.plan_handle)
INNER JOIN sys.databases
ON dm_exec_sql_text.dbid = databases.database_id
WHERE dm_exec_sql_text.text LIKE '%select%from%ord_line%'
]

Query Plan through MOCA:

In MOCA, Component [explain query] can be used to get current execution plan of a query.

query plan MOCA

Analyze query plan

After getting query plan using above techniques, next step is to check whether the plan being used is correct or not? Following are main points to be taken care of while analyzing query plan.

· How frequent this query is being executed?

· Is the plan scanning full table or Index?

Query usage:

So first, check how frequent selected query is being executed. If it has high frequency, then obviously we can’t afford slowness in it. We may need to fix slowness, however if it has very few occurrences then it can be overlooked as it has high maintenance cost compare to its usage.

Following queries can be used to get frequency of a query.

Oracle:

[select sql_id,sql_text 
from dba_hist_sqltext
where sql_id in (select sql_id
from (select s.sql_id,
RANK() OVER (
ORDER BY (max(s.CPU_TIME_TOTAL/s.executions_total)) DESC) cpu_rank,
RANK() OVER (
ORDER BY (max(s.ELAPSED_TIME_TOTAL/s.executions_total)) DESC) elapsed_rank
from dba_hist_sqlstat s,
dba_hist_snapshot sn
where sn.begin_interval_time between sysdate - 1 and sysdate
and sn.snap_id=s.snap_id
and s.executions_total >0
group by s.sql_id
)
where cpu_rank <=100
and elapsed_rank<=100
)
]

SQL Server:

[SELECT TOP 10 databases.name, 
dm_exec_sql_text.text AS TSQL_Text,
dm_exec_query_stats.creation_time,
dm_exec_query_stats.last_execution_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.last_elapsed_time,
dm_exec_query_stats.total_worker_time AS total_cpu_time,
dm_exec_query_stats.total_elapsed_time,
dm_exec_query_stats.total_logical_reads,
dm_exec_query_stats.total_physical_reads,
dm_exec_query_stats.total_elapsed_time / dm_exec_query_stats.execution_count execper,
dm_exec_query_plan.query_plan
FROM sys.dm_exec_query_stats CROSS APPLY sys.dm_exec_sql_text(dm_exec_query_stats.plan_handle) CROSS APPLY sys.dm_exec_query_plan(dm_exec_query_stats.plan_handle)
INNER JOIN sys.databases
ON dm_exec_sql_text.dbid = databases.database_id
WHERE dm_exec_sql_text.text LIKE '(@%'
and execution_count > 100
and (
last_elapsed_time > 2000000 /*2 seconds*/
or (dm_exec_query_stats.total_elapsed_time / dm_exec_query_stats.execution_count) > 2000000
)
order by last_execution_time desc
]

Full table scan or index

First let me briefly describe difference between full table scan and index seek. Let’s take an example of a book having over 1000 pages. Our task is to find a word “knowledge” and there is no index page available in book. Imagine how difficult this task is to parse all 1000 pages to find one word.

On the other hand, if indexes are defined at the end of book which took 10 extra pages. In this case, we will just scan last 10 pages and get page number from there and directly approach that page. We have used 10 more pages, but this saves the searching cost.

In database world index is a table containing two columns a key and pointer to its address. But if we have too many indexes on a table first it uses more space and on DML operations (UPDATE/INSERT/DELETE) database update indexes also and hence updating all indexes of a table makes the DML operation slow. If we don’t define any index on a table then it would be same as parsing the whole book.

To know more about Indexing https://stackoverflow.com/questions/1108/how-does-database-indexing-work

In JDA WMS the “dlytrn” table which contains transactions history and generally having has the largest amount of data. This table is used for reporting and so many different columns are accessed again and again and if we don’t use indexes on this table, the relevant list commands for reporting would be very slow.

Let’s assume our one task is to check inventory movement and we use to check this by “lodnum”. If index is defined on “lodnum” then query optimizer will go to this index and it will return our result very quickly using a logarithmic value of total rows in table. Otherwise it will have to scan millions of rows to check for given “lodnum”. Same goes for “prtnum”,” ordnum”,”ship_id” etc. As these columns are frequently used for reporting and audit, we need cluster-index on such columns. There are some more columns which are used rarely for any kind of audit or reporting e.g. “var_nam”, “fr_value”,”to_value”. Defining indexes on these columns can cost more on DML operations than the benefit of reducing searching cost.

High CPU usage

CPU usage Resource Monitor

Another reason for slowness is high CPU usage. Main thing to keep in mind is that if CPU usage is 90% It doesn’t mean it is bad. It is the ideal situation that system is fully utilizing the CPU. But if CPU usage goes to 100% and processes are getting slow then this is required to be investigated and fix. In production systems we can’t afford downtime and investigation may take much time so in this case most of time the process taking high CPU is killed/restarted or sometimes system is bounced to get it normal. This is not the ideal solution but to avoid downtime it can be accomplished.

There are multiple tools available to monitor system performance. Using these tools we may get the usage graph and if we see that CPU usage is getting higher at specific time then check if there are any jobs or tasks scheduled at that time, if yes then re-schedule these tasks to some other time to reduce load on CPU.

High Memory Usage

Memory usage Resource Monitor

The term “Memory Usage” reflects both primary and secondary memories. Sometimes if there is a resource intensive process (like backup job) is running it may take full of RAM and other processes may stuck. To avoid such situation such heavy processes should be scheduled to run in off hours.

Sometimes it occurs that a user starts a trace and forgets to stop it and continue with doing their work. Overtime trace file is getting larger and a time comes when the hard disk reaches 100 percent causing the Application to hang. To avoid this issue, different monitors should be set to check on hardware utilization, if a threshold is reached, email should be sent to relevant resources based on the level of consumption.

In older versions memory leak also creates slowness. To avoid this issue system can scheduled to be restarted weekly or at-least monthly at off hours.

Network slowness

Network slowness is also a reason for slowness. In JDA WMS sometimes RF user in specific area reports slowness issues and other areas are working fine. System also doesn’t show any slowness then network or WIFI signals strength could be the reason for this specific area slowness.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Junaid Abbasi
Junaid Abbasi

Written by Junaid Abbasi

An experienced BY WMS consultant who enjoys troubleshooting and likes to work on new ideas. reach out to me on LinkedIn: https://www.linkedin.com/in/jabbasi77/

Responses (1)

Write a response