Wednesday, August 24, 2016

End to end request processing time


In a simplified view of a typical online system, these are the places to check when diagnosing slowness:




Browser rendering time - if the page is too big, or too complex with JavaScript and style sheets, rendering can take significant time.
The number of static content items used on the web page has an impact on the overall load time. And if they are not cacheable, the round trips, even from a CDN, add to the load time.
Total download time for a page depends on how big the response content is, how many sub-requests are triggered as part of the page, the network time, and the entire server-side processing time.
Page load time is the time by which the user sees the page, so all of the above have an impact on it.

Coming to the network, it is important to know the path a request traverses from client to server. Is it taking the longest path via a CDN? How are addresses resolved over the public internet, and is any proxy in use? Network delay and packet loss are important factors to keep an eye on.
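As a quick client-side breakdown, curl's timing variables can show where the time goes for a single request (the URL is a placeholder):

```sh
# DNS lookup, TCP connect, time-to-first-byte and total time for one request
curl -o /dev/null -s \
  -w 'dns=%{time_namelookup}s connect=%{time_connect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
  https://example.com/
```

Comparing ttfb against total hints at server-side time versus download time; comparing connect against dns hints at network latency versus name resolution.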

Coming to the server side, request processing time depends on many factors: the underlying infrastructure, resources, architecture, code logic implementation, etc.
But there are a few checkpoints to look at to break it down:
Checking at the HTTP server layer shows the variation between end-user page load time and total server time. This helps identify any network delays.
Checking the difference between HTTP server and app server times shows whether there is any delay in the middle layers, like authentication or routing.
On the app server side, time can be spent in many places: requests can be routed to many servers or even to external systems. Using runtime instrumentation tools, it is possible to break down the time spent in pure code, wait times due to synchronous blocking, GC pause times, time spent reading from or writing to sockets while interacting with the DB, and time spent in remote calls, like RJVM calls to EJBs or service calls. By breaking it down this way, each underlying activity and its delays can be identified.

Each of the above is no more than an index into an ocean of tunable metrics that each underlying technology module contains. Knowing where to look and what to tune is the key.



Application performance - odd bits



We normally warm up systems and worry about how they perform under load, but will that meet customer expectations, improve customer experience, or avoid frustrating scenarios?
How about 'first time' and idle-case performance?

Did you ever check the single-user response times and server-side request processing times? If the system does not perform well for a single user, it will not do so for many. The baseline performance for a single user is what needs to be tuned first.

What about first-time cases? We cannot ask users of the system to hang on until it warms up! Then who will use it first? :) So analysing cold-start performance is important. It is important to know what happens on first access across the layers in cold cases, i.e. after a restart of the app servers, entire VMs, or the DB, and what needs to be tweaked to get better first-time performance.

How about sleeping systems? Not all systems work round the clock, at least not all days in a week. Did you ever check the resource usage of idle systems? Unexpected code paths may execute even under no load, so it is important to know resource usage in the idle case, since these systems may also run all the time.

How about app performance over a very slow network, or over networks with high packet loss? How reliable are the systems?

What happens when infrastructure fails? How many users can continue doing whatever they were doing without any interruption, and perceive the same app performance, while a DR process kicks in?

How about systems with long-running sessions? Users may not log out for a long time, the sessions must be kept alive that long, and what is the impact?

How do applications handle the case where the majority of end users do not log out but just close the browser? How should memory usage be handled in those cases?



Saturday, August 20, 2016

Java EE - EJB


Enterprise Java Beans - EJB

EJBs are Java EE server-side components that implement an application's business logic. They are normally deployed in EJB containers provided by the app servers. Implementing EJBs can provide scalability to the application and better handling of security and transactions. EJBs can also implement web services.

Types of EJB:
Session beans - to implement user actions
- Stateful: maintains state for a client; the bean instance cannot be shared between clients.
- Stateless: does not maintain state, so an instance can be allocated to any client. They are reusable, so this pool can be smaller than a stateful one.
- Singleton session beans: have only one instance for the whole time. They can be instantiated when the app starts. Although they act as stateless, there is no pool because of the single instance.

Message-Driven Beans (MDB): for listening to messages from JMS queues or topics.
(If you have read about entity beans: they are now part of the persistence API, referring to Java EE 7.)

Implementation is simple; in fact, the annotations made it so. Let's say we want to write a simple stateless EJB: just annotate the class with @Stateless and implement the business logic. To call this EJB, a servlet can do it: annotate the servlet with @WebServlet(urlPatterns="/") (the URL pattern sets the path), extend HttpServlet as usual, and to access the EJB just declare an EJB field annotated with @EJB.
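A minimal sketch of the above, assuming the javax.* APIs of Java EE 7 (the names HelloService and HelloServlet are illustrative); this fragment is deployed into a container, not run standalone:

```java
import java.io.IOException;
import javax.ejb.EJB;
import javax.ejb.Stateless;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Business logic in a stateless session bean
@Stateless
public class HelloService {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Servlet client: the container injects the EJB reference
@WebServlet(urlPatterns = "/")
class HelloServlet extends HttpServlet {
    @EJB
    private HelloService hello;

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.getWriter().println(hello.greet("world"));
    }
}
```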

However, in a typical EJB implementation there can be all types of session beans: stateless, a stateless bean implementing a service, a stateful bean accessed remotely, etc.

A remote interface is required for beans that allow remote access. This remote business interface defines all the business methods of a bean, is annotated with @Remote from the javax.ejb package, and is implemented by the session bean. A session bean can also be an endpoint for a web service.

A stateful session bean can have methods annotated with @Remove, which a client can invoke to remove the instance.

In the case of a singleton bean, concurrent access from clients can be controlled in two ways, container-managed or bean-managed, by annotating accordingly. With container-managed concurrency, the methods must be annotated with a lock type, i.e. READ or WRITE, so that concurrent access is either allowed or serialized, respectively.
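A sketch of a container-managed singleton (CacheBean is a made-up name; the annotations are from javax.ejb):

```java
import java.util.HashMap;
import java.util.Map;
import javax.ejb.ConcurrencyManagement;
import javax.ejb.ConcurrencyManagementType;
import javax.ejb.Lock;
import javax.ejb.LockType;
import javax.ejb.Singleton;
import javax.ejb.Startup;

// One instance for the whole app; @Startup creates it eagerly when the app starts
@Singleton
@Startup
@ConcurrencyManagement(ConcurrencyManagementType.CONTAINER)
public class CacheBean {
    private final Map<String, String> cache = new HashMap<>();

    @Lock(LockType.READ)   // concurrent readers are allowed
    public String get(String key) {
        return cache.get(key);
    }

    @Lock(LockType.WRITE)  // writers get exclusive access
    public void put(String key, String value) {
        cache.put(key, value);
    }
}
```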

For stateless session beans to implement web service endpoints, they must be annotated with @WebService, and the business methods to expose must be annotated with @WebMethod. They can also implement asynchronous methods so that clients need not wait for responses from long-running methods.

Coming to EJB pools: in WebLogic, there is an element called max-beans-in-free-pool in weblogic-ejb-jar.xml. It determines the maximum number of EJB instances made available in the free pool, i.e. it puts a cap on the pool size. For MDBs, the container creates as many instances as required, limited to max-beans-in-free-pool. The default number of MDB threads is 16, but this can be changed with a custom queue or with work managers.
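For illustration, a weblogic-ejb-jar.xml fragment capping the free pool (the bean name and sizes are made up; initial-beans-in-free-pool controls how many instances are created upfront):

```xml
<weblogic-ejb-jar xmlns="http://xmlns.oracle.com/weblogic/weblogic-ejb-jar">
  <weblogic-enterprise-bean>
    <ejb-name>HelloService</ejb-name>
    <stateless-session-descriptor>
      <pool>
        <!-- cap on instances kept in the free pool -->
        <max-beans-in-free-pool>100</max-beans-in-free-pool>
        <!-- instances created when the bean is deployed -->
        <initial-beans-in-free-pool>10</initial-beans-in-free-pool>
      </pool>
    </stateless-session-descriptor>
  </weblogic-enterprise-bean>
</weblogic-ejb-jar>
```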

Sunday, August 14, 2016

Oracle VM


Virtualization is a technique to share hardware resources among multiple systems or users, to achieve optimal usage of resources and reduce costs.

Although virtualization is a generic concept, let's talk about server virtualization. This means a bunch of HW resources, like CPUs, memory, disks, ports, etc., are shared among multiple OSs, either of the same type or of multiple types. To achieve this, we need someone or something to manage the underlying HW and the guests (OSs) running above it. This 'manager' is what is called a 'hypervisor'.

There are a couple of types of hypervisors.
Native or bare-metal hypervisor - software that runs directly on the host's hardware to control the hardware and monitor the guest OSs. So imagine this as something that mediates between the guest OSs and the underlying hardware. Examples of such implementations are Oracle VM, VMware ESXi, Xen, and Microsoft Hyper-V.

The other type of hypervisor runs within a traditional operating system, and guest OSs then run on top of it. An example is Oracle VirtualBox, which can be installed on a PC where Windows is the base OS; VirtualBox can then host another guest OS like Linux.


Oracle VM Server:

This can be installed on x86 instruction set based platforms with the Xen hypervisor (GPL licensed) or on SPARC platforms (which have their own hypervisor).
In general, these implementations have their own firmware/hardware, a hypervisor, and then a super domain/VM that controls the resource allocation to the other guest VMs (also called domains, or simply guests).

So, simply put, an Oracle VM server is a combination of hardware (CPU, memory, network, IO, etc.), a hypervisor (for managing the bare metal, i.e. the hardware), and domains (the VMs, each with its own OS, except Dom0, which is a complete Linux kernel and manages all the other domains).

Let's explore some interesting things related to Oracle VM.

CPU capacity:
How to determine the CPU capacity on a VM server?
xm info is the command to use. For example, as shown below, the number of CPUs is 72, which is really the count of hardware threads: there are 2 nodes, 18 cores per socket and 2 threads per core,
i.e. 2 * 2 * 18 = 72 threads (0 to 71 in total; 0-35 on socket 1, 36-71 on socket 2).

nr_cpus                : 72
nr_nodes               : 2
cores_per_socket       : 18
threads_per_core       : 2

The CPU topology can be viewed using the command xenpm get-cpu-topology
CPU     core    socket  node
CPU0     0       0       0
CPU1     0       0       0
CPU2     1       0       0
CPU3     1       0       0
..
xm info also gives high-level VM server details, like the supported word size, the instruction set (like Intel x86), the number of real CPUs, the number of nodes, the number of sockets, the number of threads per core, the CPU frequency,
memory, page size, etc.
In a hyperthreaded model, each core runs 2 threads instead of one; this has some counter-effects but can improve efficiency.


vCPUs
Virtual CPUs are the CPUs assigned to a guest/DomU, i.e. a virtual machine running as a DomU can be assigned, say, 10 CPUs, which are considered virtual CPUs; the actual binding to real CPUs depends on how they are configured. In the example below, vm1 is a virtual machine with id=1 and 3 vCPUs, which are in the blocked state (-b-) and currently mapped to CPUs 3, 6 and 7. vm1 is configured with a CPU affinity of 2-35, which is the first socket on a 2-socket, 72-thread machine. So, since there is no absolute binding, the mapping can change at runtime and, depending on availability,
the vCPUs can be mapped to any of the real CPUs in the range 2-35.

xm vcpu-list
Name  ID  VCPU   CPU State   Time(s) CPU Affinity
vm1   1     0     3   -b-   5354.1 2-35
vm1   1     1     6   -b-   2312.4 2-35
vm1   1     2     7   -b-   2337.8 2-35

You can pin CPUs for guest VMs at runtime, but changing the affinity of dom0 requires a reboot. And dom0 always takes top priority.
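For example, pinning at runtime could look like this (the VM name and CPU numbers are illustrative):

```sh
xm vcpu-pin vm1 0 4   # pin vCPU 0 of vm1 to physical CPU 4
xm vcpu-list vm1      # verify the new CPU/affinity mapping
```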
It is always good to monitor the real CPU usage from the VM server to check how, in an oversubscribed case, busy VMs on the same socket can impact each other.

JNDI - Java Naming and Directory Interface





What is JNDI? It's a naming service.
Why is it needed? In a distributed enterprise application there are multiple resources, like DB pools, or business components, like EJBs, deployed in Java EE containers, and they need a way to be located. JNDI serves that purpose.

Applications can use annotations to locate a resource. Take datasources, which are database resources that provide connections to the database: when application code refers to a datasource and invokes the JDBC API to get a connection, it gets a physical connection. If connection pooling is implemented, it instead gets a handle to a pooled connection object rather than a direct physical connection.
These connections need to be closed, and when closed they go back to the pool. A pool of database connections gives better performance and a better connection handling mechanism.
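As a sketch, a component can obtain a pooled connection either by injection or by an explicit JNDI lookup (the JNDI name jdbc/myDS and the class OrderDao are made up; the exact name depends on server configuration):

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

@Stateless
public class OrderDao {
    // Injection: the container resolves the JNDI name to the pooled datasource
    @Resource(lookup = "jdbc/myDS")
    private DataSource ds;

    public void save() throws SQLException {
        // try-with-resources closes the connection, returning it to the pool
        try (Connection con = ds.getConnection()) {
            // ... JDBC work ...
        }
    }

    // The equivalent explicit lookup
    DataSource lookupDs() throws NamingException {
        return (DataSource) new InitialContext().lookup("jdbc/myDS");
    }
}
```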

Similarly JNDI mapping can be done to other services like JMS, LDAP etc.,

Below are some of the resources on a GlassFish server and their JNDI mappings.








Monday, August 8, 2016

Oracle database performance


Let's explore Oracle database performance aspects at a high level.

Some of the key terms to know:

SGA: System Global Area - basically a collection of memory units or structures shared by all the processes of a DB instance.
PGA: Program Global Area - a memory region specific to a single process (server process or background process).
Buffer cache: basically a cache for data blocks read from data files. The buffer cache is shared across users.
Shared pool: basically contains program data like parsed SQLs, PL/SQL code, data dictionaries, etc.; it is accessed in almost every DB operation.

And let's see some of the interesting views:
gv$process - details on the currently active processes, which are either on CPU, in a latch wait, or spinning on a latch; also contains PGA details per process.
gv$sgastat - details on the System Global Area (SGA), for each pool: shared/large/java/streams pools.
gv$session - details for each current session. Data from this view is sampled every second into V$ACTIVE_SESSION_HISTORY. From 11g Release 2 onwards, each individual request can be traced with the help of its ECID.
gv$pgastat - details on PGA usage.

We can take periodic snapshots of the above views to analyze further.
Example: CREATE SNAPSHOT snapshotonprocess AS SELECT * FROM gv$process
Then the per-instance PGA can be calculated, like: SELECT inst_id, COUNT(*) cnt, ROUND(SUM(pga_used_mem) / 1024 / 1024 / 1024, 2) pga_gb FROM snapshotonprocess GROUP BY inst_id ORDER BY inst_id
- adding a criterion like 'background IS NOT NULL' will give PGA stats for background processes only.

Similarly, we can query SGA stats, shared pool usage, buffer cache usage, active sessions, etc.
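For example, a rough per-pool SGA summary can be pulled from gv$sgastat (a sketch; the view reports bytes per SGA component):

```sql
-- SGA usage by pool, in MB, per RAC instance
SELECT inst_id,
       NVL(pool, 'fixed/other') pool,
       ROUND(SUM(bytes) / 1024 / 1024) mb
FROM   gv$sgastat
GROUP  BY inst_id, pool
ORDER  BY inst_id, mb DESC;
```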

Another good place to look for session performance, and to point out slow SQLs or event waits, is V$ACTIVE_SESSION_HISTORY.
V$ACTIVE_SESSION_HISTORY contains sampled session activity in the database; samples are taken every second. So it is possible to calculate how much time an end-user request spent in the DB, and on which queries, by tracing its ECID in this view.
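A hedged example: since ASH samples once per second, counting samples approximates seconds of DB time (the ECID literal below is a placeholder):

```sql
-- Approximate DB time per SQL for one end-user request, traced by ECID
SELECT sql_id, session_state, event, COUNT(*) approx_secs
FROM   v$active_session_history
WHERE  ecid = '<ecid-of-the-request>'
GROUP  BY sql_id, session_state, event
ORDER  BY approx_secs DESC;
```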

Let's look at the awesome reports Oracle database provides on the performance of the database.

AWR - a great performance report on DB workloads. It contains information about DB time, system resource usage, waits or other events that could impact performance, and SQL statistics like long-running SQLs, resource-intensive SQLs, or SQLs with high buffer gets.
We can check how the buffer cache and shared pool sizes changed from the beginning to the end of the snapshot interval.
Logical reads (preferable to physical reads), physical reads (the fewer the better), hard parses (a high rate on a warmed-up system perhaps indicates that plans are not being reused; stats may need to be gathered, or literals are being attached to the queries at runtime), rollbacks, etc.
Latch (a short-lived serialization mechanism Oracle uses on shared structures) efficiency metrics, the top 5 foreground events, and CPU and memory stats are a good place to start if there is an overall DB performance hit.
If the performance issue is specific to SQLs, then the other areas to look at in the AWR are the related SQL statistics: number of executions, time taken per execution, CPU used, buffer gets, hard parses, etc. And, upon identifying the SQLs (the best way is to match the ECIDs for front-end initiated SQLs),
SQLHC is the next step to analyze the historical performance, the SQL execution plan, indexes, bind variables/conditions, etc.

To get better insight, keep a good baseline AWR in the system so that any future snapshot can be compared against it.

To create an AWR snapshot:
EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;
select snap_id from dba_hist_snapshot order by snap_id asc;

To generate AWR: select output from table(dbms_workload_repository.awr_report_html(,,,));

To declare the baseline:
BEGIN
  DBMS_WORKLOAD_REPOSITORY.create_baseline (
    start_snap_id => <###>,
    end_snap_id   => <###>,
    baseline_name => 'baseline');
END;
/

To compare AWRs:
@$ORACLE_HOME/rdbms/admin/awrddrpi.sql
or
select * from TABLE(DBMS_WORKLOAD_REPOSITORY.awr_diff_report_html(,,,,,,,));


Other good reports to look at:
ADDM report - an Automatic Database Diagnostic Monitor report; helps identify issues captured in the Automatic Workload Repository.