DBA - Day1
Perfect! I’ll give you hands-on-style answers with commands, output samples, and explanation, just like a real-world DBA would use. Let’s begin with the enhanced version of your earlier question:
---
Question 1: How do you troubleshoot high CPU usage in Oracle 19c?
Step-by-Step Approach (Hands-on)
---
Step 1: OS-Level Check using top or vmstat
Command:
top -n 1
Sample Output:
%Cpu(s): 90.5 us, 1.5 sy, 0.0 ni, 8.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
us (user): high value here (e.g., 90%) indicates application/SQL consuming CPU.
If id (idle) is very low and us is high = DB or app is the cause.
Alternative Command (better for deeper view):
vmstat 2 5
Sample Output:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
3 0 0 20000 15000 500000 0 0 0 1 200 300 85 5 10 0 0
r = run queue. If r > no. of CPUs, system is under pressure.
us/sy: High = CPU hogging.
id: Low = busy CPU.
wa: High = I/O waits (not CPU).
---
Step 2: Database-Level Session Analysis
Query:
SELECT sid, serial#, username, status, program, sql_id, event
FROM v$session
WHERE status = 'ACTIVE'
AND username IS NOT NULL;
Check Top SQL by CPU:
SELECT sql_id, cpu_time/1000000 cpu_sec, executions, sql_text
FROM v$sql
ORDER BY cpu_time DESC
FETCH FIRST 5 ROWS ONLY;
---
Step 3: AWR Report Analysis
Run:
@$ORACLE_HOME/rdbms/admin/awrrpt.sql
Important Sections to Review:
1. Top 5 Timed Events – shows if CPU is the bottleneck:
Event % Total Call Time
----------------------------- ------------------
CPU time 78.45%
db file sequential read 12.32%
→ High CPU time indicates SQLs are inefficient.
2. SQL ordered by CPU Time
Focus on top SQLs — capture SQL_ID, tune them.
3. Instance Efficiency %
Buffer Cache Hit, Library Hit Ratio — if low, tuning needed.
4. Load Profile
DB Time(s): 10,000
DB CPU(s): 8,000
High DB CPU compared to DB Time = CPU bottleneck.
---
Step 4: Action Plan
Identify top SQLs — tune them (add index, rewrite, collect stats).
If session_cached_cursors or shared_pool is small, tune memory.
Patch optimizer bugs if query plan is changing often.
Enable adaptive stats if not already (_optimizer_adaptive_features).
---
Summary (What to tell in interview):
> “I start with OS tools like top and vmstat to see if the CPU is under pressure. Then inside Oracle, I use v$session, v$sql and generate AWR reports to find top SQLs by CPU. I check ‘Top Timed Events’ and ‘SQL ordered by CPU time’. Based on this, I tune SQLs, memory parameters, or optimizer settings.”
---
Would you like me to proceed similarly with:
RMAN restore?
Data Guard troubleshooting?
AWR analysis full example?
Great — here’s the deep-dive on the next three topics, in the same hands-on style:
---
1. RMAN Restore (Duplicate Database Example)
Scenario: Clone PROD to UAT using RMAN DUPLICATE
---
Step-by-Step:
Source: PROD
Target: UAT
Backup location: /u01/backup/prod/
---
Step 1: Create init file on UAT
db_name=PROD
db_unique_name=UAT
db_file_name_convert='/u01/oradata/PROD/','/u01/oradata/UAT/'
log_file_name_convert='/u01/oradata/PROD/','/u01/oradata/UAT/'
---
Step 2: Start UAT in NOMOUNT
export ORACLE_SID=UAT
sqlplus / as sysdba
STARTUP NOMOUNT PFILE='/u01/app/oracle/dbs/initUAT.ora';
---
Step 3: RMAN Duplicate
rman target sys@PROD auxiliary sys@UAT
DUPLICATE TARGET DATABASE TO UAT
BACKUP LOCATION '/u01/backup/prod/'
NOFILENAMECHECK;
---
Sample Output:
Finished Duplicate Db at 22-APR-25
RMAN-03002: command completed successfully
---
Interview Notes (What to Say):
> I use RMAN DUPLICATE for UAT/Dev clones. I start target in NOMOUNT with a PFILE, configure file name conversion, and run DUPLICATE TARGET DATABASE. I ensure backups are accessible and redo logs are managed properly post-clone.
---
2. Data Guard Troubleshooting (Archive Gap / Sync Issue)
Common Issue: Archive log not shipped or applied to standby
---
Step 1: Check Gaps
-- On Primary
SELECT DEST_ID, STATUS, ERROR FROM V$ARCHIVE_DEST;
-- On Standby
SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE# DESC;
---
Step 2: Manually Transfer Archive Logs
scp /u01/arch/1_34567_111.arc oracle@standby:/u01/arch/
---
Step 3: Register on Standby
ALTER DATABASE REGISTER LOGFILE '/u01/arch/1_34567_111.arc';
---
Step 4: If far behind – Roll Forward Standby
On primary:
rman target /
BACKUP INCREMENTAL FROM SCN 123456789 DATABASE FORMAT '/tmp/inc_%U.bkp';
On standby:
rman target /
CATALOG START WITH '/tmp/';
RECOVER DATABASE NOREDO;
---
Interview Notes (What to Say):
> I monitor log gaps using v$archived_log and if standby lags, I copy missing logs and register them. For large gaps, I use incremental roll forward via RMAN to sync standby. I also check Data Guard broker if configured.
---
3. AWR Report Walkthrough
---
Run AWR:
sqlplus / as sysdba
@?/rdbms/admin/awrrpt.sql
Choose:
Begin and end snapshot
HTML/Text report
---
Sections to Review:
---
A. Top Timed Events
Event %Time Waits Avg Wait
-------------------------- ------ ------- ---------
CPU time 78.45
db file sequential read 12.22 9,000 0.2ms
Interpretation:
High CPU time → inefficient SQL
High “sequential read” → index access
---
B. Load Profile
DB Time: 10,000s DB CPU: 9,000s
Redos: 50MB/s Executes: 12,000
Interpretation:
DB CPU ≈ DB Time = CPU-bound
High executes = possible parsing issues
---
C. Instance Efficiency
Buffer Cache Hit %: 92.56
Parse %: 30.0
Library Cache Hit %: 89.5
Note:
Parse% > 10 → shared pool issues
Library Cache Hit < 95 → SQL not reused
---
D. SQL Ordered by Elapsed/CPU
SQL ID Elapsed Time (s) CPU (s) Execs SQL Text
------------ ----------------- ------- ----- ------------------
abc12h3kjh2 5000 4500 100 SELECT ... JOIN ...
Tune these SQLs. Get plans via:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('abc12h3kjh2'));
---
What to Say:
> I focus on Top Timed Events to find bottlenecks, Load Profile for system pressure, and SQL ordered by CPU/Elapsed Time to isolate problematic queries. I also validate buffer cache and parse stats for inefficiencies.
---
Would you like me to create a Word or PDF version of all these examples as a reference document?
Or shall I continue next with:
Oracle RAC node crash handling
Block corruption detection & repair
Data Pump export/import real-time example
GoldenGate real-time setup/monitoring
Let me know what you’d like next!
*****
Oracle DBA Hands-On Interview Preparation
1. Troubleshooting High CPU Usage
Commands:
top -n 1
vmstat 2 5
SQL Queries:
SELECT sid, sql_id, event FROM v$session WHERE status='ACTIVE';
SELECT sql_id, cpu_time, sql_text FROM v$sql ORDER BY cpu_time DESC FETCH FIRST 5 ROWS ONLY;
AWR Focus:
- Top Timed Events: Look for "CPU time"
- SQL by CPU Time
- Load Profile - DB Time vs DB CPU
Say in interview:
I correlate OS and DB metrics. If top shows CPU spikes, I use AWR to identify high CPU SQLs, tune them or apply fixes.
2. RMAN Restore (Duplicate Example)
Prepare init file with file name conversions, start in NOMOUNT, run:
DUPLICATE TARGET DATABASE TO UAT BACKUP LOCATION '/u01/backup/prod/' NOFILENAMECHECK;
Say in interview:
I use RMAN DUPLICATE for UAT clones. I configure file mappings and restore from backup with minimal downtime.
3. Data Guard Troubleshooting
Check Gaps:
SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG;
Manual Steps:
scp /u01/arch/1_34567_111.arc standby:/u01/arch/
ALTER DATABASE REGISTER LOGFILE ...
Say in interview:
I monitor archive sync via v$archived_log and use RMAN roll-forward if needed.
4. AWR Report Analysis
Key Sections:
- Top Timed Events (CPU or waits)
- Load Profile (DB CPU vs DB Time)
- SQL by Elapsed/CPU Time
- Instance Efficiency
Say in interview:
I review AWR for top waits, high-cost SQL, a
nd inefficient memory usage and tune accordingly.
Comments
Post a Comment