DirectShow Filters Development Part 3: Transform Filters
Introduction
Transform filters are probably the most interesting pieces of the DirectShow puzzle. They encapsulate complex image and video processing algorithms. From a filter development point of view, they are not harder to implement than others; however, they do require some additional coding and method overrides. As with rendering and source filters, transform filters also have base classes from which you should inherit when implementing your custom work.
Transform filters have at least two pins: one input pin and one output pin. They are divided into two categories: copy-transform filters and in-place transform filters. As the names imply, a copy-transform filter takes the data from the input pin, transforms it, and writes the outcome to a separate output sample, whereas an in-place filter performs its work directly on the input sample and passes that same sample on through the output pin.
DirectShow provides three base classes for writing transform filters:
CTransformFilter - base class for copy-transform filters
CTransInPlaceFilter - base class for in-place transforms
CVideoTransformFilter - designed for video decoding and has built-in quality control management for dropping frames in case of flooding
I will cover the first two classes in this article: a CTransInPlaceFilter descendant will be used for a text overlay filter, and CTransformFilter will be used for a JPEG/JPEG2000 encoder.
Before we continue, you should take a look at part 1 of this series as the filter development prerequisites, filter registration, and filter debugging are all the same.
Text Overlay Filter
A text overlay filter adds user-defined text to every frame that goes through the filter. It can be used for displaying subtitles or a logo. Adding text to a video frame does not change its media subtype or format, so an in-place transform is a natural fit. I will be using GDI+ for overlays, as it provides a convenient API for wrapping an existing buffer in a bitmap and drawing characters on it.
using namespace Gdiplus;
using namespace std;
class CTextOverlay : public CTransInPlaceFilter, public ITextAdditor
{
public:
DECLARE_IUNKNOWN;
CTextOverlay(LPUNKNOWN pUnk, HRESULT *phr);
virtual ~CTextOverlay(void);
virtual HRESULT CheckInputType(const CMediaType* mtIn);
virtual HRESULT SetMediaType(PIN_DIRECTION direction, const CMediaType *pmt);
virtual HRESULT Transform(IMediaSample *pSample);
static CUnknown *WINAPI CreateInstance(LPUNKNOWN pUnk, HRESULT *phr);
STDMETHODIMP NonDelegatingQueryInterface(REFIID riid, void ** ppv);
STDMETHODIMP AddTextOverlay(WCHAR* text, DWORD id, RECT position,
COLORREF color = RGB(255, 255, 255), float fontSize = 20);
STDMETHODIMP Clear(void);
STDMETHODIMP Remove(DWORD id);
private:
ULONG_PTR m_gdiplusToken;
VIDEOINFOHEADER m_videoInfo;
PixelFormat m_pixFmt;
int m_stride;
map<DWORD, Overlay*> m_overlays;
};
Transform is pure virtual and must be implemented in your class. CheckInputType, which is called for each media type during pin connection negotiation, also has to be supplied, and I have overridden SetMediaType as well. Since a transform filter has at least two pins, SetMediaType takes a direction argument that indicates whether the connection is being made on the input or the output pin. You may want to save both the input and output video headers. In this case, I only need the input video header since it is exactly the same as the output:
HRESULT CTextOverlay::SetMediaType(PIN_DIRECTION direction, const CMediaType *pmt)
{
if(direction == PINDIR_INPUT)
{
VIDEOINFOHEADER* pvih = (VIDEOINFOHEADER*)pmt->pbFormat;
m_videoInfo = *pvih;
HRESULT hr = GetPixleFormat(m_videoInfo.bmiHeader.biBitCount, &m_pixFmt);
if(FAILED(hr))
{
return hr;
}
BITMAPINFOHEADER bih = m_videoInfo.bmiHeader;
// Rows of an uncompressed RGB DIB are padded to a 4-byte boundary.
m_stride = ((bih.biWidth * bih.biBitCount + 31) & ~31) / 8;
}
return S_OK;
}
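One detail worth checking in SetMediaType is the stride. Rows in an uncompressed RGB DIB are padded to a multiple of four bytes, so the stride equals width times bytes per pixel only when that product is already a multiple of four. A minimal sketch of the usual calculation follows; the helper name DibStride is hypothetical and is not part of the original filter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original filter): bytes per scan line
// of an uncompressed RGB DIB, rounded up to a multiple of 4 bytes.
inline std::int32_t DibStride(std::int32_t width, std::int32_t bitsPerPixel)
{
    assert(width > 0 && bitsPerPixel > 0);
    // Round the row size in bits up to the next multiple of 32, then
    // convert bits to bytes.
    return ((width * bitsPerPixel + 31) & ~31) / 8;
}
```

For a 640-pixel-wide 24-bit frame this gives 1920 bytes, the same as the unpadded row, while a 641-pixel-wide frame gives 1924 bytes rather than 1923.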
The filter accepts only RGB formats with 15, 16, 24, and 32 bits per pixel. Using the GDI+ Bitmap class, it is possible to create a bitmap object directly over the sample buffer without any buffer copy. After that, I create a Graphics object from that bitmap and call the Graphics::DrawString method to draw the user-defined text on the bitmap:
HRESULT CTextOverlay::Transform(IMediaSample *pSample)
{
CAutoLock lock(m_pLock);
BYTE* pBuffer = NULL;
Status s = Ok;
map<DWORD, Overlay*>::iterator it;
HRESULT hr = pSample->GetPointer(&pBuffer);
if(FAILED(hr))
{
return hr;
}
BITMAPINFOHEADER bih = m_videoInfo.bmiHeader;
Bitmap bmp(bih.biWidth, bih.biHeight, m_stride, m_pixFmt, pBuffer);
Graphics g(&bmp);
for ( it = m_overlays.begin() ; it != m_overlays.end(); it++ )
{
Overlay* over = (*it).second;
SolidBrush brush(over->color);
Font font(FontFamily::GenericSerif(), over->fontSize);
s = g.DrawString(over->text, -1, &font, over->pos,
StringFormat::GenericDefault(), &brush);
if(s != Ok)
{
// Requires <strsafe.h>; bounded formatting cannot overflow msg.
WCHAR msg[100];
StringCchPrintfW(msg, ARRAYSIZE(msg), L"Failed to draw text : %s", over->text);
::OutputDebugStringW(msg);
}
}
return S_OK;
}
Using the ITextAdditor interface, you can add a text overlay with an ID, remove an overlay by ID, or remove all overlays. Each overlay contains the text, the bounding rectangle, the color, and the font size:
DECLARE_INTERFACE_(ITextAdditor, IUnknown)
{
STDMETHOD(AddTextOverlay)(WCHAR* text, DWORD id, RECT position,
COLORREF color, float fontSize) PURE;
STDMETHOD(Clear)(void) PURE;
STDMETHOD(Remove)(DWORD id) PURE;
};
Overlay objects are stored in a map in a thread-safe manner, so you can freely add and remove overlays during playback. Thread safety in the DirectShow framework is achieved using critical sections and the CAutoLock class. A CAutoLock is usually declared at the beginning of a method; it acquires the critical section on construction and releases it when it goes out of scope at the end of the method.
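The scope-based locking that CAutoLock provides is the same RAII pattern as std::lock_guard in standard C++. The sketch below is a portable analogy only: OverlayStore and its members are hypothetical names, and the real filter uses the critical section and CAutoLock from the DirectShow base classes rather than std::mutex.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the filter's overlay map, guarded by a mutex.
class OverlayStore
{
public:
    void Add(unsigned long id, const std::wstring& text)
    {
        std::lock_guard<std::mutex> lock(m_lock); // lock acquired here
        m_overlays[id] = text;
    }                                             // released when 'lock' leaves scope

    bool Remove(unsigned long id)
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return m_overlays.erase(id) > 0;          // true if an overlay was removed
    }

    std::size_t Count() const
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return m_overlays.size();
    }

private:
    mutable std::mutex m_lock;
    std::map<unsigned long, std::wstring> m_overlays;
};
```

Because the guard is released on every exit path, an early return or an exception cannot leave the lock held, which is the main reason to prefer this pattern over manual enter and leave calls.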
JPEG / JPEG2000 Encoder
It took me a while to decide what type of video encoding to implement, and eventually I decided to make a simple intra-frame encoder, in which each video frame is encoded with no reference to the previous or next frame. This type of encoding is easier to implement than inter-frame encoding standards like MPEG4 or H264, but it produces a higher bit rate because much of the pixel information is redundant between neighboring frames. I also created a base class for other intra-frame encoder types, and you can easily swap the implementation by inheriting from CBaseCompressor and updating the factory method which creates the concrete implementations:
struct CBaseCompressor
{
virtual HRESULT Init(BITMAPINFOHEADER* pBih) PURE;
virtual HRESULT Compress(BYTE* pInput, DWORD inputSize, BYTE* pOutput,
DWORD* outputSize) PURE;
virtual HRESULT SetQuality(BYTE quality) PURE;
virtual HRESULT GetMediaSubTypeAndCompression(GUID* mediaSubType,
DWORD* compression) PURE;
};
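The factory method itself is not shown in the article, so the following is only a sketch of how such a factory might look. It uses simplified stand-in types so that it compiles on its own: EncoderKind, Compressor, JpegCompressor, Jpeg2000Compressor, and CreateCompressor are all hypothetical names, whereas the real code works with CBaseCompressor and the EncoderType values of IJ2KEncoder.

```cpp
#include <memory>
#include <string>

// Simplified stand-ins (assumptions, not the article's actual types).
enum class EncoderKind { Jpeg, Jpeg2000 };

struct Compressor
{
    virtual ~Compressor() = default;
    virtual std::string Name() const = 0;
};

struct JpegCompressor : Compressor
{
    std::string Name() const override { return "JPEG"; }
};

struct Jpeg2000Compressor : Compressor
{
    std::string Name() const override { return "JPEG2000"; }
};

// Hypothetical factory: the only place that knows the concrete classes.
inline std::unique_ptr<Compressor> CreateCompressor(EncoderKind kind)
{
    switch (kind)
    {
    case EncoderKind::Jpeg:     return std::make_unique<JpegCompressor>();
    case EncoderKind::Jpeg2000: return std::make_unique<Jpeg2000Compressor>();
    }
    return nullptr; // unknown kind
}
```

With this arrangement, adding another intra-frame codec means writing one new derived class and one new case in the factory, while the filter keeps talking to the base interface.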
By default, the encoding standard is JPEG, and it is based on code I found on CodeProject. Using the IJ2KEncoder::SetEncoderType method, you can change the implementation to the JPEG2000 encoding standard, which is based on the OpenJpeg library. Please note that once one of the filter's pins is connected, you cannot change the encoder implementation, so it is best to set the desired encoding algorithm right after filter creation.
JPEG2000 and Media Sub Types
When using a JPEG compressor, DirectShow provides a built-in media sub type called MEDIASUBTYPE_MJPG, which is declared in the uuids.h file. For JPEG2000, I could not find any appropriate GUID, so I created one using the following macro definition:
DEFINE_GUID( MEDIASUBTYPE_MJ2C, MAKEFOURCC('M', 'J', '2', 'C'),
0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa,
0x00, 0x38, 0x9b, 0x71);
When using the BITMAPINFOHEADER structure for compressed images, you have to set the biCompression field to MAKEFOURCC('M', 'J', '2', 'C'). This way, the filter can connect to JPEG2000 decoders.
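The FOURCC works as both the biCompression value and the first field of the subtype GUID because MAKEFOURCC packs the four characters into a single 32-bit value, with the first character in the lowest byte. The sketch below is a portable illustration; MakeFourCC is a stand-in written for this example and is not the Windows macro itself.

```cpp
#include <cstdint>

// Portable stand-in for the Windows MAKEFOURCC macro (for illustration only).
constexpr std::uint32_t MakeFourCC(char c0, char c1, char c2, char c3)
{
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c0))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 24);
}

// 'M' = 0x4D, 'J' = 0x4A, '2' = 0x32, 'C' = 0x43
static_assert(MakeFourCC('M', 'J', '2', 'C') == 0x43324A4Du,
              "first character ends up in the lowest byte");
```

The same packing applied to 'M', 'J', 'P', 'G' yields 0x47504A4D, which is the first field of MEDIASUBTYPE_MJPG, so the custom subtype follows the same convention as the built-in one.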
MJ2C means a JPEG2000 code stream, and it is actually a motion JPEG2000 definition where each frame consists of compressed image data. Another standard is J2K, and it is usually used for still image encoding and also contains headers.
Although JPEG2000 provides better compression ratios and better image quality, especially at lower bit rates, it is more CPU intensive than JPEG and hence less suitable for high-resolution video. While researching JPEG2000 implementations, I found a project called CUJ2K, a JPEG2000 implementation based on CUDA, the GPU programming API developed by NVIDIA. Since this library was designed for still images stored on the hard drive, it takes the source and destination image paths on the command line. Using it for in-memory buffers would have required some additional work, so I decided to go with OpenJpeg; however, CUJ2K is worth looking at if you need better performance.
Filter Implementation
To implement a transform filter, you have to implement six methods:
Transform - receives input and output media samples.
CheckInputType - checks whether the input pin can connect to an upstream filter.
CheckTransform - checks whether a transformation is possible between input and output media types.
DecideBufferSize - sets the memory buffer size for the output media samples.
GetMediaType - returns the media type used to connect the output pin with the downstream filter.
SetMediaType - called when the input and output pins are successfully connected.
class CJ2kCompressor : public CTransformFilter, public IJ2KEncoder
{
public:
DECLARE_IUNKNOWN;
CJ2kCompressor(LPUNKNOWN pUnk, HRESULT *phr);
virtual ~CJ2kCompressor(void);
// CTransformFilter overrides
virtual HRESULT Transform(IMediaSample * pIn, IMediaSample *pOut);
virtual HRESULT CheckInputType(const CMediaType* mtIn);
virtual HRESULT CheckTransform(const CMediaType* mtIn,
const CMediaType* mtOut);
virtual HRESULT DecideBufferSize(IMemAllocator * pAlloc,
ALLOCATOR_PROPERTIES *pProp);
virtual HRESULT GetMediaType(int iPosition, CMediaType *pMediaType);
virtual HRESULT SetMediaType(PIN_DIRECTION direction, const CMediaType *pmt);
static CUnknown * WINAPI CreateInstance(LPUNKNOWN pUnk, HRESULT *pHr);
STDMETHODIMP NonDelegatingQueryInterface(REFIID riid, void ** ppv);
// IJ2KEncoder
STDMETHODIMP SetQuality(BYTE quality);
STDMETHODIMP SetEncoderType(EncoderType encoderType);
private:
VIDEOINFOHEADER m_VihIn;
VIDEOINFOHEADER m_VihOut;
CBaseCompressor* m_encoder;
};
The Transform method implementation is pretty straightforward: I get the buffer pointers from the input and output media samples and then pass them to the CBaseCompressor implementation. After that, I set the actual output media sample size and set the sync point to true, since every frame is a reference frame:
HRESULT CJ2kCompressor::Transform(IMediaSample* pIn, IMediaSample* pOut)
{
HRESULT hr = S_OK;
BYTE *pBufIn, *pBufOut;
long sizeIn;
DWORD sizeOut;
hr = pIn->GetPointer(&pBufIn);
if(FAILED(hr))
{
return hr;
}
sizeIn = pIn->GetActualDataLength();
hr = pOut->GetPointer(&pBufOut);
if(FAILED(hr))
{
return hr;
}
hr = m_encoder->Compress(pBufIn, sizeIn, pBufOut, &sizeOut);
if(FAILED(hr))
{
return hr;
}
hr = pOut->SetActualDataLength(sizeOut);
if(FAILED(hr))
{
return hr;
}
hr = pOut->SetSyncPoint(TRUE);
return hr;
}
Filter Registration
Since this filter is a video encoder, it should be registered in the video compressor filters category, and this is done using the IFilterMapper2 interface:
STDAPI RegisterFilters( BOOL bRegister )
{
HRESULT hr = NOERROR;
WCHAR achFileName[MAX_PATH];
char achTemp[MAX_PATH];
ASSERT(g_hInst != 0);
if( 0 == GetModuleFileNameA(g_hInst, achTemp, sizeof(achTemp)))
{
return AmHresultFromWin32(GetLastError());
}
MultiByteToWideChar(CP_ACP, 0L, achTemp, lstrlenA(achTemp) + 1,
achFileName, NUMELMS(achFileName));
hr = CoInitialize(0);
if(bRegister)
{
hr = AMovieSetupRegisterServer(CLSID_Jpeg2000Encoder,
J2K_FILTER_NAME, achFileName, L"Both", L"InprocServer32");
}
if( SUCCEEDED(hr) )
{
IFilterMapper2 *fm = 0;
hr = CoCreateInstance( CLSID_FilterMapper2, NULL,
CLSCTX_INPROC_SERVER, IID_IFilterMapper2, (void **)&fm);
if( SUCCEEDED(hr) )
{
if(bRegister)
{
IMoniker *pMoniker = 0;
REGFILTER2 rf2;
rf2.dwVersion = 1;
rf2.dwMerit = MERIT_DO_NOT_USE;
rf2.cPins = 2;
rf2.rgPins = psudPins;
hr = fm->RegisterFilter(CLSID_Jpeg2000Encoder, J2K_FILTER_NAME,
&pMoniker, &CLSID_VideoCompressorCategory, NULL, &rf2);
}
else
{
hr = fm->UnregisterFilter(&CLSID_VideoCompressorCategory, 0,
CLSID_Jpeg2000Encoder);
}
}
if(fm)
fm->Release();
}
if( SUCCEEDED(hr) && !bRegister )
{
hr = AMovieSetupUnregisterServer( CLSID_Jpeg2000Encoder );
}
CoFreeUnusedLibraries();
CoUninitialize();
return hr;
}
STDAPI DllRegisterServer()
{
return RegisterFilters(TRUE);
}
STDAPI DllUnregisterServer()
{
return RegisterFilters(FALSE);
}